DNA POLYMERASE III HOLOENZYME



Research support which led to the making of the present invention
was provided in part by funding from the National Institutes of
Health under Grant No. GM-38839. Accordingly, the federal government
has certain statutory rights to the invention described herein under
35 U.S.C. 200 et seq.

The present application for Letters Patent is a Continuation-in-Part
of my earlier United States Patent Application 07/826,926, filed
January 24th 1992, said Continuation-in-Part having been filed as
International Patent Application PCT US93/00627 on January 22nd 1993.

In 1982, Arthur Kornberg was the first to purify DNA polymerase III
holoenzyme (holoenzyme) and determine that it was the principal
polymerase that replicated the E. coli chromosome.

In common with chromosomal replicases of phages T4 and T7, yeast,
Drosophila, mammals and their viruses, the E. coli holoenzyme
contains at least ten subunits in all (α, ε, θ, τ, γ, δ, δ', χ, ψ, β)
[see J. Biol. Chem. 257:11468 (1982)]. It has been proposed that
chromosomal replicases may contain a dimeric polymerase in order to
replicate both the leading and lagging strands concurrently. Indeed
the 1 MDa size of the holoenzyme and apparent equal stoichiometry of
its subunits (except β, which is twice the abundance of the others) is
evidence that the holoenzyme has the following dimeric composition:
(αεθ)2τ2(γδδ'χψ)2β4.

One of the features of the holoenzyme which distinguish it as a
chromosomal replicase is its use of ATP to form a tight, gel
filterable, "initiation complex" on primed DNA. The holoenzyme
initiation complex completely replicates a uniquely primed
bacteriophage single-strand DNA (ssDNA) genome coated with the ssDNA
binding protein (SSB), at a speed of at least 500 nucleotides per
second (at 30°C) without dissociating from an 8.6 kb circular DNA even
once. This remarkable processivity (nucleotides polymerized in one
template binding event) and catalytic speed is in keeping with the
rate of replication fork movement in E. coli (1 kb/second at 37°C). In
comparison, DNA polymerase I as well as the T4 polymerase, Taq
polymerase, and T7 polymerase (Sequenase) are all very slow (10-20
nucleotides per second) and lack high processivity (10-12 nucleotides
per binding event). With these distinctive features the pol III
holoenzyme has commercial application.

However, there is a good reason it has not yet been applied
commercially. Namely, there are only a few (10-20) molecules of
pol III holoenzyme per cell and thus it is difficult to purify; only a
few tenths of a milligram can be obtained from 1000 liters of cells;
and it cannot be simply overproduced by genetic engineering because it
is composed of 10 different subunits.

The subunits of DNA polymerase III holoenzyme are set forth in the
following table:

Gene    Subunit   Mass (kDa)   Functions

        α         130          DNA polymerase
        ε         27           Proofreading 3'-5' exonuclease
holE    θ         10
        τ         71           Dimerizes core, DNA-dependent ATPase
        γ         47           Binds ATP
holA    δ         35           Interacts with γ to transfer β to DNA
holB    δ'*       33           DNA-dependent ATPase with γ
holC    χ         15
holD    ψ         12
        β         40           Sliding clamp on DNA, binds core

* As discovered in making the present invention, the δ' is a mixture
of two proteins, both encoded by the same holB gene, and therefore it
may be regarded as two subunits of the holoenzyme, thus bringing the
total number of subunits in the holoenzyme to eleven.
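
As a rough consistency check on the proposed dimeric composition, the subunit masses listed in the table can be summed with the copy numbers implied by (αεθ)2τ2(γδδ'χψ)2β4. The following is only a minimal sketch: the dictionary and function names are illustrative, and δ' is counted as a single 33 kDa species.

```python
# Subunit masses in kDa, as listed in the subunit table.
MASS_KDA = {
    "alpha": 130, "epsilon": 27, "theta": 10, "tau": 71, "gamma": 47,
    "delta": 35, "delta_prime": 33, "chi": 15, "psi": 12, "beta": 40,
}

# Copy numbers implied by the proposed composition:
# two copies of every subunit except beta, which is present four times.
COPIES = {name: 2 for name in MASS_KDA}
COPIES["beta"] = 4


def holoenzyme_mass_kda(masses=MASS_KDA, copies=COPIES):
    """Return the summed mass (kDa) of all subunit copies."""
    return sum(masses[name] * copies[name] for name in copies)


if __name__ == "__main__":
    print(holoenzyme_mass_kda(), "kDa")
```

The sum comes to a little under 1000 kDa, which is in line with the approximately 1 MDa size quoted for the holoenzyme.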

The genes for 5 of the holoenzyme's subunits have been identified
[see Nucleic Acids Research 14(20):8091 (1986); Gene 28:159 (1984);
PNAS (USA) 80:7137 (1982); J. of Bacteriology 169(12):5735 (1987);
and J. of Bacteriology 158:455 (1984)]. These 5 genes have been cloned
and overproducing expression plasmids for these 5 subunits are
commercially available. However, prior to the present invention, the
identity of the genes for the remaining 5 subunits which make up the
holoenzyme was not known.

The present invention describes, for the first time, the genetic
and peptide sequences for the remaining five subunits of the
polymerase III holoenzyme. In addition to the sequencing of these
genes, very efficient overproducing plasmids for each of them have
been constructed, and purification protocols for each have been
devised. Whereas the low amount of holoenzyme in cells has allowed the
subassemblies to be available only in microgram quantities prior to
the present invention (milligrams of pure α, ε, τ, γ and β subunits
are available using molecular cloned genes in overproducing expression
plasmids), utilizing techniques according to the present invention it
has been possible to obtain approximately 100 mg of homogeneous
subunit from 4 L of cells.

Prior to the identification of the remaining 5 genes of the
holoenzyme, a few micrograms of each subunit were resolved from the
holoenzyme. The sequence analysis of these resolved subunits
eventually led to the identification of their genetic sequences, and
then to the genes per se. In addition, reconstitution studies were
carried out to determine which subunits were essential to the speed
and processivity of the holoenzyme. In addition, overproducing
expression plasmids for these 5 subunits were produced.

Following these studies, it has now been determined, according
to the present invention, that at least 5 subunits are required for the
action of this enzyme (α, ε, β, δ, and γ), and preferably 6 subunits are
essential for the speed and processivity of the holoenzyme. These
subunits, the combination of which is essential for the unique
synthetic capabilities of the holoenzyme, according to the present
invention, are: α, ε, β, δ, δ', and γ.

The 5 subunits according to the present invention which have
been identified, sequenced, cloned, provided in overproducing
expression plasmids, expressed, and purified for the first time are
subunits δ, δ', χ, θ, and ψ.
The following figures, detailed description and examples are
provided in order to allow the reader to obtain a fuller and more
complete understanding of the present invention. With regard to the
figures,

Fig. 1 depicts the pET-δ expression vector according to the
present invention;

Fig. 2 depicts the pET-δ' expression vector according to the
present invention;

Fig. 3A, B, and C depict the replication activity of δ, δ' and δδ'
with γ and τ according to the present invention;

Fig. 4A and B depict the effect of δ' and δ on the ATPase activity
of γ and τ according to the present invention;

Fig. 5 depicts the pET-θ expression plasmid according to the
present invention;

Fig. 6A and B depict the reconstitution assay according to the
present invention indicating that θ does not stimulate DNA synthesis;

Fig. 7 depicts that θ, according to the present invention,
stimulates ε in excision of an incorrect 3' TG base pair;

Fig. 8A and B depict the native molecular weight of αε and pol III
core according to the present invention;

Fig. 9 depicts the construction of the pET-ψ overproducing
plasmid according to the present invention;

Fig. 10A and B depict the stimulation of the DNA dependent
ATPase of γ and τ by ψ and χ, according to the present invention;

Fig. 11 depicts the construction of the pET-χ expression plasmid
according to the present invention; and

Fig. 12A and B depict the native molecular mass of χ, ψ and the χψ
complex, according to the present invention.

More specifically with regard to figure 1, there is shown the
expression vector for δ as prepared and described in the following
examples. The holA gene excised from M13-δ-NdeI using NdeI is shown
above the pET3c vector. The open reading frame encoding δ is inserted
into the NdeI site of pET3c such that the initiating ATG is positioned
downstream of the Shine-Dalgarno sequence and a T7 promoter.
Downstream of the holA insert are 492 nucleotides of E. coli DNA and
591 nucleotides of M13mp18 DNA. The T7 RNA polymerase termination
sequence is downstream of the holA insert.

More specifically with regard to figure 2, the holB fragment
excised from M13-δ'-NdeI using NdeI is shown above the expression
vector. The open reading frame encoding δ' is inserted into the NdeI
site of pET3c such that the initiating ATG is positioned downstream of
the Shine-Dalgarno sequence and a T7 promoter. The holB insert also
contains 158 nucleotides of E. coli DNA downstream of the holB stop
codon to an NdeI site. The T7 polymerase termination sequence is
downstream of the holB insert.

With regard to figure 3, replication assays were performed as
described below. Figure 3C summarizes the replication assays. Either γ
or τ was titrated into assays containing SSB "coated" primed M13mp18
ssDNA, β, αε and either 2 ng δ, 2 ng δ' or an equal mixture (1 ng each)
of δ and δ' (δδ'). The reaction mixture was preincubated for 8 minutes
to allow reconstitution of the processive polymerase prior to
initiating a 20 second pulse of DNA synthesis. Figure 3A depicts the
results of the γ subunit being titrated into the replication mixture
either alone (open squares) or containing either δ' (closed circles),
δ (open circles), or δδ' (closed squares). Figure 3B depicts the
results of the τ subunit being titrated into the replication mixture
either alone (open triangles), or containing either δ' (closed
circles), δ (open circles), or δδ' (closed squares).

With regard to figure 4, ATPase assays were performed in the
presence of M13mp18 ssDNA as described in detail below. The subunits
in each assay are identified below the plot in the figure. Figure 4A
refers to the effect of δ, δ' and β on the γ ATPase; figure 4B refers
to the effect of δ, δ' and β on the τ ATPase.

With regard to figure 5, the shaded NdeI-BamHI segment includes
the holE gene (arrow). Transcription of holE is driven by a T7
promoter. The T7 RNA polymerase termination sequence is downstream
from the E. coli DNA insert. Translation of θ is aided by an upstream
Shine-Dalgarno sequence.
With regard to figure 6, the replication reactions were performed
as described below. Figure 6A outlines the protocol summarizing the
assay. Either the αε complex or reconstituted pol III core (αεθ) was
titrated into the assay, which contains β, γ complex and primed phage
φX174 ssDNA "coated" with SSB. Proteins and DNA were preincubated for
6 minutes to allow time for assembly of the processive polymerase. A
15 second round of synthesis was initiated upon addition of the
remaining deoxynucleoside triphosphates. Circles: titration of αε
complex; triangles: titration of αεθ. The α subunit was limiting in
these assays and therefore the amount of αε and αεθ added to the assay
is taken as the amount of α added.

With regard to figure 7, there are depicted the results of a
titration of θ into the assay containing ε and a mismatched 3'
32P-end-labelled T residue on a synthetic "hooked" primer template.

With regard to figure 8, there is shown a comparison of the
migration of αε and pol III core relative to protein standards on gel
filtration and in glycerol gradients. The position of pol III core
reconstituted using either excess or substoichiometric θ was the same
in both types of analysis. Figure 8A depicts gel filtration analysis on
Superose 12. The Stokes radius of protein standards was calculated
from their known diffusion coefficients. Figure 8B depicts glycerol
gradient sedimentation analysis. The standards, with their masses,
Stokes radii and sedimentation coefficients, are: Amy, sweet potato
β-amylase (152 kDa, 8.9S); Apf, horse apoferritin (467 kDa, 59.5 Å);
IgG, bovine immunoglobulin G (158 kDa, 52.3 Å, 7.4S); BSA, bovine
serum albumin (67 kDa, 34.9 Å, 4.41S); Ova, chicken ovalbumin
(43.5 kDa, 27.5 Å, 3.6S); Myo, horse myoglobin (17.5 kDa, 19.0 Å,
2.0S). The positions of αε and pol III core relative to the protein
standards are indicated in the plots. The Stokes radii and S values of
αε and pol III core were measured by comparison to the standards.
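
The calculation of a Stokes radius from a known diffusion coefficient can be illustrated with the Stokes-Einstein relation, R = kT / (6πηD). The sketch below is not taken from this specification: it assumes water at 20°C (viscosity 1.002 mPa·s), and the function name and the example diffusion coefficient for BSA (about 6.1 x 10^-7 cm²/s, a commonly quoted literature value) are illustrative assumptions.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def stokes_radius_angstrom(diffusion_cm2_per_s,
                           temp_k=293.15,
                           viscosity_pa_s=1.002e-3):
    """Stokes-Einstein radius in angstroms.

    Defaults assume water at 20 degrees C; both defaults are assumptions
    made for illustration, not conditions stated in the specification.
    """
    diffusion_m2_per_s = diffusion_cm2_per_s * 1e-4  # cm^2/s -> m^2/s
    radius_m = (BOLTZMANN_J_PER_K * temp_k) / (
        6 * math.pi * viscosity_pa_s * diffusion_m2_per_s
    )
    return radius_m * 1e10  # m -> angstrom


if __name__ == "__main__":
    # Assumed literature value for BSA, used only as an example input.
    print(round(stokes_radius_angstrom(6.1e-7), 1), "angstrom")
```

With that assumed input the result falls close to the 34.9 Å listed for BSA among the standards, which is the kind of agreement the calibration relies on.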

With regard to figure 9, the holD gene was amplified from
genomic DNA using primers which form an NdeI site at the initiating
ATG and a downstream BamHI site. Due to an internal NdeI site within
holD, insertion of the complete holD gene into the pET3c expression
plasmid required the two steps shown below. Additional details appear
in the following description.

With regard to figure 10, ATPase assays were performed using a
two-fold molar excess of χ and ψ (as monomers) over γ and τ (as dimers)
and using M13mp18 ssDNA as an effector. Figure 10A depicts ATPase
assays of ψ, χ, γ and combinations of these proteins; figure 10B
depicts the effect of ψ and χ subunits on the ATPase of τ. Subunits in
the assays are indicated below the plots, and assays performed in the
presence of SSB are indicated.
With regard to figure 11, the holC gene was amplified from
genomic DNA using primers which generate an NdeI site at the start
codon of holC and a BamHI site 152 nucleotides downstream of holC as
described below. The 604 bp amplified product was purified, digested
with NdeI and BamHI, and ligated into the NdeI and BamHI sites of pET3c
to yield pET-χ. The open reading frame encoding χ was inserted into the
NdeI-BamHI sites of pET3c such that the initiating ATG is positioned
downstream of the Shine-Dalgarno sequence and a T7 promoter. The T7
RNA polymerase termination sequence is downstream of the holC insert.
Amp r indicates the ampicillin resistance gene; ori indicates the
pBR322 origin of replication.

With regard to figure 12A, the Stokes radius of χ, ψ and the χψ
complex was determined by comparison with protein standards in gel
filtration on Superdex 75. With regard to figure 12B, the S values of
χ, ψ and the χψ complex determined by comparison to protein standards
in glycerol gradient analysis are given. The protein standards were:
bovine serum albumin (BSA), 34.9 Å, 4.41S; chicken ovalbumin (Ova),
27.5 Å, 3.6S; soybean trypsin inhibitor (Ti), 23.8 Å; bovine carbonic
anhydrase II (Carb), 3.06S; horse myoglobin (Myo), 19.0 Å, 2.0S; and
horse kidney metallothionein (Met), 1.75S.
In general, the determination of the sequence for each of the genes
for the five subunit peptides, according to the present invention,
began with isolating, purifying and sequencing the individual peptides.

The δ, δ', χ, ψ subunits were purified by a combination of two
published procedures. First the γ complex (γ, δ, δ', χ, ψ) was purified
from 1.5 kg E. coli HB101 (pNT203-pSK100) as described by Maki [see
J. Biol. Chem. 263:6555 (1988)]. Second, the complex was split into two
fractions - a "γχψ" complex and a "δδ'" complex - as described by
O'Donnell [see J. Biol. Chem. 265:1179 (1990)]. The peptide sequences
for δ and δ' were obtained from the δδ' fraction; whereas the peptide
sequences of χ and ψ were obtained from the γχψ fraction. The θ subunit
sequence was obtained from a side fraction of this procedure which
contained nearly pure polymerase III core (α, ε, θ) subunits.

For all 5 proteins, the amino acid sequences were obtained in the
same manner, by the use of N-terminal analysis and tryptic analysis.
N-terminal analysis was conducted using known techniques of SDS-PAGE
electrophoresis [see Nature 227:680 (1970)] in a 15% gel, and
subsequent electroblotting onto PVDF membrane. The resolved peptides
were removed from the membrane and sequenced. For tryptic analysis,
either δδ' or γχψ complex was chromatographed in a 15% SDS-PAGE gel to
separate the individual subunits. However, for this procedure, 100
pmol was electroblotted onto nitrocellulose. The nitrocellulose
membrane was then digested with trypsin, and the peptides resolved by
microbore HPLC. The resolved peptides were then sequenced.

The electroblotting procedure used in the tryptic analysis
protocol is more fully described in the following general example:

EXAMPLE I
(electroblotting)
SDS (Bio-Rad) was purified by crystallization from ethanol-water.
SDS (100 g) was added to ethanol (450 g), stirred, and heated to
55°C. Additional hot water was added (50-75 ml) until all of the SDS
dissolved. Activated charcoal (10 g) was added to the solution, and
after 10 minutes the slurry was filtered through a Buchner funnel
(Whatman No. 5 paper) to remove all the charcoal. The filtered solution
was chilled, first at 4°C for 24 hrs and then at -20°C for an additional
24 hrs. Crystalline SDS was collected on a coarse-frit sintered-glass
funnel and washed with 800 ml of ethanol chilled to -20°C. The
partially purified SDS was then recrystallized using the above
procedure but without the charcoal. A 0.75 mm SDS-Laemmli gel was made
using ultra-pure reagents. Prior to electrophoresis 10 mM glutathione
(to a final concentration of 0.05 mM) was added to the upper chamber
buffer, and the system preelectrophoresed 2 hr at 3-5 mA (3 mA for
mini-gel, 5 mA for normal). After 2 hrs, the upper chamber was emptied
and standard tris-glycine buffer was added. The samples to be run were
denatured using Laemmli denaturation solution made from the ultra-pure
reagents (in the presence of 5 mM DTT). The gel was run under
conditions such that separation was achieved in less than 2 hrs. After
the gel run, the gel was soaked for 30 min in 10 mM CAPS pH 11, 5%
methanol (% of methanol will vary depending on the size of the protein:
in general, high molecular weight proteins transfer more efficiently in
absence of methanol while low molecular weight proteins require
methanol in the buffer). CAPS buffer was made by titrating a 10 mM
solution with 10 N NaOH. For gel transfer, slices of Immobilon were wet
in 100% methanol and equilibrated 10 min in the CAPS transfer buffer,
and the protein transferred using a Bio-Rad mini blotter (transfer time
will vary depending on protein size, methanol, etc.; a ~70 kDa
polypeptide will transfer in 90 min in the presence of 5% methanol at
15 V). After transfer, the Immobilon was soaked in distilled water for
5 min, and the membrane was stained with 0.1% Coomassie Blue R250 in
50% methanol for 1 min, and destained in 50% methanol and 10%
aldehyde-free acetic acid. The membranes were soaked in distilled water
for 10 min, and allowed to air dry. Protein bands of interest were cut
from the membrane and stored in eppendorf tubes at -20°C until
sequenced.
The identification of the δ subunit gene of DNA polymerase III
was accomplished by purifying the δδ' proteins to apparent homogeneity
through an ATP-agarose column from 1.3 kg of the γ/τ overproducing
strain of E. coli [HB101 (pNT203, pSK100)].

The δδ' subunits were separated by electrophoresis in a 15% SDS-PAG
(polyacrylamide gel), then electroblotted onto PVDF membrane
(Whatman) for N-terminal sequencing (50 pmol each) [see J. Biol. Chem.
262:10035 (1987)], and onto nitrocellulose membrane (Schleicher and
Schuell) for tryptic analysis (140 pmol each) [see PNAS USA 84:6970
(1987)]. Proteins were visualized by Ponceau S (Sigma). Protein
sequences were determined using an Applied Biosystems 470A gas-phase
microsequenator. Sequence results were as follows:

N-terminal sequence:
NH2-Met Leu Arg Leu Tyr Pro Glu Gln Leu Arg Ala Gln Leu Asn
                      5                  10
    Glu Gly Leu Arg Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro;
     15                  20                  25

tryptic peptide δ-1:
NH2-Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro Leu Leu Leu Gln
                      5                  10
    Glu Ser Gln Asp Ala Val Arg;
     15                  20

tryptic peptide δ-2:
NH2-Ala Gln Glu Asn Ala Ala Trp Phe Thr Ala Leu Ala Asn Arg
                      5                  10

tryptic peptide δ-3:
NH2-Val Glu Gln Ala Val Asn Asp Ala Ala His Phe Thr Pro Phe
                      5                  10
    His Trp Val Asp Ala Leu Leu Met (Gly) (Lys).
     15                  20

Parentheses in the above sequence indicate uncertainty in
the amino acid assignment.

The DNA sequencing, construction of the overproducing vector,
and DNA replication assays for this subunit were conducted according
to the following example:

EXAMPLE II

DNA sequencing:

The 3.2 kb KpnI/BglII (restriction enzymes, New England Biolabs)
fragment containing δ was excised from λ169 (Kohara) and directionally
ligated into pUC18 to yield pUCdelta. Both strands of DNA were
sequenced by the chain termination method of Sanger using the United
States Biochemicals Sequenase kit, [α-35S]dCTP (New England Nuclear),
and synthetic DNA 17-mers (Oligos etc. Inc.). All sequence information
presented here was determined on both strands using both dGTP and
dITP in sequencing reactions.

Construction of the overproducing vector:

Approximately 1.7 kb of DNA upstream of δ was excised from
pUCdelta using KpnI (polylinker site) and BstXI (the BstXI site is 13
base pairs upstream of the start codon of holA) followed by
self-ligation of the plasmid. A 1.5 kb fragment containing the holA
gene was then excised using EcoRI and XbaI (these sites are in the pUC
polylinker on either side of the δ insert) followed by directional
ligation into M13mp18 to yield M13delta. An NdeI site was generated at
the start codon of holA by primer directed mutagenesis [see Methods
Enzymol. 154:367 (1987)] using a DNA 33-mer (5'->3'):
GTACAACCGA ATCATATGTT ACCCAGCGAG CTC 33
containing the NdeI site (CATATG, underlined in the original) at the
start codon of holA to prime replication of M13delta viral ssDNA, and
using DNA polymerase and SSB in place of Klenow polymerase to
completely replicate the circle without strand displacement [see
J. Biol. Chem. 260:12884 (1985)]. The NdeI site was verified by DNA
sequencing. An NdeI fragment (2.1 kb) containing the δ gene was excised
from the NdeI mutated M13delta and ligated into pET-3c linearized using
NdeI to yield pETdelta. The orientation of the holA gene in pETdelta
was determined by sequencing.
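
Checking a construct for NdeI or BamHI sites, as is done when a new NdeI site is engineered at a start codon, amounts to scanning the sequence for the recognition motif. The following is a minimal sketch only: the function name is hypothetical, and the two short demo strings are illustrative fragments around a start codon, not the full sequences used in the example. Both recognition sequences (CATATG for NdeI, GGATCC for BamHI) are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequences of the two enzymes used for cloning into pET3c.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each motif in seq."""
    seq = seq.upper().replace(" ", "")
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start + 1)           # report 1-based positions
            start = seq.find(motif, start + 1)    # allow overlapping matches
        hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    wild_type = "GGGTAACTGATGATTCGG"   # illustrative fragment, no NdeI site
    mutated = "GGGTAACATATGATTCGG"     # same fragment with CATATG introduced
    print(find_sites(wild_type))
    print(find_sites(mutated))
```

The same scan can be used to flag an internal site, such as the internal NdeI site that required a two-step insertion for holD.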
DNA replication assays:

The replication assay contained 72 ng M13mp18 ssDNA (0.03 pmol
as circles) uniquely primed with a DNA 30-mer [see J. Biol. Chem.
266:11328 (1991)], 980 ng SSB (13.6 pmol as tetramer), 22 ng β (0.29
pmol as dimer), 200 ng γ (2.1 pmol as tetramer), and 55 ng αε complex
in a final volume (after addition of proteins) of 25 µl 20 mM Tris-HCl
(pH 7.5), 8 mM MgCl2, 5 mM DTT, 4% glycerol, 40 µg/ml BSA, 0.5 mM
ATP, 60 µM each dCTP, dGTP, dATP and 20 µM [α-32P]dTTP (New
England Nuclear). Proteins used in the reconstitution assay were
purified to apparent homogeneity and their concentration determined.
Delta protein, or column fraction containing δ, was diluted in buffer
(20 mM Tris-HCl (pH 7.5), 2 mM DTT, 0.5 mM EDTA, 20% glycerol, 60 mM
NaCl and 50 ng/ml BSA) such that 1-10 ng of protein was added to the
assay on ice, shifted to 37°C for 5 minutes, then quenched upon
spotting directly onto DE81 filter paper. DNA synthesis was
quantitated as described.
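
The mass and molar amounts quoted for the assay components are related by a simple unit conversion, sketched below. The helper names are hypothetical. The M13mp18 genome length (7249 nucleotides) and the average mass of about 330 g/mol per ssDNA nucleotide are outside assumptions, not values stated in this specification.

```python
def ng_to_pmol(mass_ng, molar_mass_g_per_mol):
    """Convert a mass in ng to pmol for a given molar mass (g/mol)."""
    # ng divided by g/mol gives nmol; multiply by 1000 to get pmol.
    return mass_ng / molar_mass_g_per_mol * 1e3


def implied_mass_kda(mass_ng, amount_pmol):
    """Molar mass (kDa) implied by a quoted mass and molar amount."""
    # ng per pmol is numerically equal to kg/mol, that is, kDa.
    return mass_ng / amount_pmol


if __name__ == "__main__":
    # Assumed values: 7249 nt genome, ~330 g/mol per nucleotide.
    m13_molar_mass = 7249 * 330
    print(ng_to_pmol(72, m13_molar_mass))   # ssDNA circles in the assay
    print(implied_mass_kda(980, 13.6))      # SSB tetramer mass implied
```

Under those assumptions the 72 ng of template works out to roughly 0.03 pmol of circles, in agreement with the figure quoted above.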

Gel filtration:

Gel filtration of δ, δ' and the δδ' complex was performed using an
HR 10/30 Superdex 75 column equilibrated in 20 mM Tris-HCl (pH 7.5),
10% glycerol, 2 mM DTT, 0.5 mM EDTA and 100 mM NaCl (buffer B). Either
δ (30 µg, 0.78 nmol as monomer), δ' (30 µg, 0.81 nmol as monomer) or a
mixture of δ and δ' (30 µg of each) was incubated for 30 minutes at
15°C in 100 µl of buffer B, then the entire 100 µl sample was injected
onto the column. The column was developed with buffer B at a flow
rate of 0.3 ml/minute and after the first 6 ml, fractions of 170 µl
were collected. Fractions were analyzed by 13% SDS polyacrylamide gels
(100 µl per lane) stained with Coomassie Brilliant Blue. Densitometry
of stained gels was performed using a Pharmacia-LKB Ultrascan XL
laser densitometer.
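
Given the stated flow rate (0.3 ml/minute), the 6 ml passed before collection, and the 170 µl fraction size, the elution volume and time for any fraction follow directly. The sketch below assumes that fraction 1 begins immediately after the first 6 ml and that fractions are numbered from 1; both are assumptions, as are the function names.

```python
def fraction_window_ml(fraction_number, pre_volume_ml=6.0, fraction_ml=0.170):
    """Return (start, end) elution volume in ml for a 1-based fraction."""
    start = pre_volume_ml + (fraction_number - 1) * fraction_ml
    return start, start + fraction_ml


def elution_time_min(volume_ml, flow_ml_per_min=0.3):
    """Minutes after injection at which a given volume has eluted."""
    return volume_ml / flow_ml_per_min


if __name__ == "__main__":
    start, end = fraction_window_ml(10)
    print(start, end, elution_time_min(start))
```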

Gel filtration of γ, or γ mixed with either δ or δ' or both δ and δ',
was performed using an HR 10/30 Superose 12 column equilibrated in
buffer B. Protein mixtures were preincubated 30 minutes at 15°C in
100 µl buffer B then injected onto the column, and the column was
developed and analyzed as described above. Replication activity assays
of these column fractions were performed as described above with the
following modifications. The γ subunit was omitted from the assay and
each fraction was diluted 50-fold with 20 mM Tris-HCl (pH 7.5), 10%
glycerol, 2 mM DTT, 0.5 mM EDTA and 50 µg/ml BSA. Then 2 µl of
diluted fraction was withdrawn and added to the assay.

Protein standards were a mixture of proteins obtained from
BioRad and from Sigma Chemical Co. and were brought to a
concentration of approximately 50 µg each in 100 µl buffer B before
analysis on either Superdex 75 or Superose 12 columns.

Glycerol gradient sedimentation:

Sedimentation analysis of δ, δ' and a mixture of δ and δ' was
performed using 11.6 ml 10%-30% glycerol gradients in buffer B. Either
δ (57 µg, 1.5 nmol as monomer), δ' (56 µg, 1.5 nmol as monomer) or a
mixture of δ and δ' (57 µg and 56 µg, respectively) was incubated at
15°C for 30 minutes in a final volume of 100 µl buffer B, then each
sample was layered onto a separate gradient. Protein standards (50 µg
each in 100 µl buffer B) were layered onto another gradient and the
gradients were centrifuged at 270,000 x g for 60 hours at 4°C.
Fractions of 170 µl were collected from the bottom of the tube and
analyzed (106 µl/lane) in a 13% SDS polyacrylamide gel stained with
Coomassie Blue.

Light scattering:

The diffusion coefficient of δ, δ' and the δδ' complex was
determined by dynamic light scattering at 780 nm in a fixed angle (90°)
Biotage model dp-801 light scattering instrument (Oros Instruments).
Samples of δ (200 µg, 5.2 nmol as monomer) and δ' (200 µg, 5.4 nmol as
monomer) were in 400 µl of 20 mM Tris-HCl (pH 7.5), 100 mM NaCl and
1.2% glycerol. The mixture of δ and δ' (100 µg of each) was in 400 µl of
20 mM Tris-HCl (pH 7.5) and 100 mM NaCl. The observed diffusion
coefficient of δ' in the presence of 1.2% glycerol was 0.6% higher than
in the absence of glycerol. Hence, the 1.2% glycerol in the δ and δ'
samples had little effect on the observed diffusion coefficient.

The purification of δ was performed according to the following
example:

EXAMPLE III
(purification of δ)
BL21 (DE3) cells harboring pETdelta were grown at 37°C in 12
liters of LB media containing 100 µg/ml of ampicillin. Upon growth to
OD 1.5, the temperature was lowered to 25°C, and IPTG was added to 0.4
mM. After a further 3 hrs of growth, the cells (50 g) were collected by
centrifugation. Cells were lysed using lysozyme as described in prior
publications, and the debris removed by centrifugation. The following
purification steps were performed at 4°C. The assay for δ is as
described above.

The clarified cell lysate (300 ml) was diluted 2-fold with 20 mM
Tris-HCl (pH 7.5), 20% glycerol, 0.5 mM EDTA, 2 mM DTT (buffer A) to a
conductivity equal to 112 mM NaCl, and then loaded (over 3 hrs) onto a
60 ml Hexylamine Sepharose column equilibrated with buffer A plus 0.1
M NaCl. The Hexylamine column was washed with 60 ml buffer A plus
0.1 M NaCl, and then eluted (over 14 hrs) using a 600 ml linear gradient
of 0.1 M NaCl to 0.5 M NaCl in buffer A. Eighty fractions were collected.
Fractions 16-34 (125 ml) were dialyzed against 2 liters of buffer A
plus 90 mM NaCl overnight, and then diluted 2-fold with buffer A to
yield a conductivity equal to 65 mM NaCl just prior to loading (over 45
min) onto a 60 ml column of Heparin Sepharose equilibrated in buffer A
plus 50 mM NaCl. The Heparin column was washed with 120 ml buffer A
plus 50 mM NaCl, and then eluted (over 14 hrs) using a 600 ml linear
gradient of 0.05 M NaCl to 0.5 M NaCl in buffer A. Eighty fractions were
collected. Fractions 24-34 were pooled and diluted 3-fold (to a
volume of 250 ml) with buffer A to a conductivity equal to 85 mM NaCl
just prior to loading (over 50 min) onto a 50 ml Hi-Load 26/10 Q
Sepharose fast flow FPLC column. The column was washed with 150 ml
buffer A plus 50 mM NaCl, and then eluted using a 600 ml linear
gradient of 0.05 M NaCl to 0.5 M NaCl in buffer A. Eighty fractions
were collected. Fractions 28-36, which contained pure δ, were pooled
(74 ml, 1.9 mg/ml), passed over a 1 ml ATP-agarose column (N-6
linked) to remove any possible γ complex contaminant, and then dialyzed
versus two changes of 2 liters each of buffer A containing 0.1 M NaCl
(the DTT was omitted for the purpose of determining protein
concentration spectrophotometrically) before storing at -70°C.

The following table gives the results obtained from measuring
the protein levels obtained from the fractions taken in Example III.

Fraction           total     total        specific       fold          %
                   protein   units¹       activity       purification  yield
                   (mg)                   (units/mg)

I   Lysate²        2070      5.4 x 10^9   2.6 x 10^6     1.0           100
II  Hexylamine     446       2.5 x 10^9   5.6 x 10^6     2.2
III Heparin        197       2.0 x 10^9   10.2 x 10^6    3.9           37
IV  Q Sepharose    141       1.5 x 10^9   10.6 x 10^6    4.1           28

¹ One unit is defined as pmol nucleotide incorporated per minute.
² Omission of gamma from the assay of the lysate resulted in a
  200-fold reduction of specific activity (units/mg).
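
The derived columns of the purification table (specific activity, fold purification, % yield) can be recomputed from the total protein and total units of each step. This is a minimal sketch with hypothetical names; it takes the lysate's total units (5.4 x 10^9) and specific activity (2.6 x 10^6 units/mg) as the reference values.

```python
# Reference values for the starting lysate, as given in the table.
LYSATE_UNITS = 5.4e9
LYSATE_SPECIFIC_ACTIVITY = 2.6e6  # units/mg

# (step name, total protein in mg, total units) for the column steps.
STEPS = [
    ("Hexylamine", 446, 2.5e9),
    ("Heparin", 197, 2.0e9),
    ("Q Sepharose", 141, 1.5e9),
]


def summarize(protein_mg, units):
    """Return (specific activity, fold purification, percent yield)."""
    specific = units / protein_mg
    fold = specific / LYSATE_SPECIFIC_ACTIVITY
    yield_pct = 100.0 * units / LYSATE_UNITS
    return specific, fold, yield_pct


if __name__ == "__main__":
    for name, protein_mg, units in STEPS:
        specific, fold, yield_pct = summarize(protein_mg, units)
        print(f"{name}: {specific:.3g} units/mg, "
              f"{fold:.1f}-fold, {yield_pct:.0f}% yield")
```

Recomputing the rows this way is a quick check that the tabulated fold-purification and yield values are mutually consistent.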
The δ gene was identified using amino acid sequence information
from δ. The sequence of the N-terminal 28 amino acids of δ and the
sequence of three internal tryptic peptides were determined. One of the
tryptic peptides (tryptic peptide δ-1) overlapped 10 amino acids of the
N-terminal sequence. A search of GenBank revealed a sequence
which predicted the exact amino acid sequence of the 21 amino acid
tryptic peptide δ-1 which overlapped the N-terminal sequence. The
matching sequence occurred just downstream of the rlpB gene at 15.2
minutes of the E. coli chromosome. The match of the published DNA
sequence to the N-terminal sequence of δ was imperfect due to a few
errors in the published sequence of this region. The published sequence
of rlpB accounted for approximately 22% of the δ gene and did not
encode either of the other two tryptic fragments. The Kohara lambda
phage 169 contains 19 kb of DNA surrounding the δ gene. The 3.2 kb
KpnI/BglII fragment containing δ was excised from λ169, cloned into
pUC18 and the δ gene was sequenced. The DNA sequence predicts the
correct N-terminal sequence of δ (except for Ile instead of Leu at
position 2) and encodes the other two internal tryptic peptides of δ in
the same reading frame, and predicts a 343 amino acid protein of 38.7
kDa, consistent with the mobility of δ in SDS-PAGE (35 kDa).

The full nucleic acid sequence for the 8 gene according to the 
present invention was determined to be: 
~""ATG ATT CGG TTG TAC CCG GAA CAA CTC CGC GCG CAG CTC 39 
AAT GAA GGG CTG CGC rra OTR TAT CTT TTA CTT GOT AAC 78 
25 r.a T rrr nr, TT^ ttc hag GAA AGC TAG GAC OCT GTT CGT 117 
CAG GTA GCT GCG GCA CAA GGA TTC GAA GAA CAC CAC ACT 156 
TTT TCC ATT GAT CCC AAC ACT GAC TGG AAT GCG ATC TTT 195 
TCG TTA TGC CAG GCT ATG AGT CTG TTT GCC AGT CGA CAA 234 
ACG CTA TTG CTG TTG TTA OCA GAA AAC GGA CCG AAT GCG 273 
30 GCG ATC AAT GAG CAA CTT CTC ACA CTC ACC GGA CTT CTG 312 
CAT GAC GAC CTG CTG TTG ATC GTC CGC GGT AAT AAA TTA 351 
^ atve err. PAA GAA AAT G OT . GCC TGG TTT ACT GCG CTT 390 
GCG AAT CGC AGC GTG CAG GTG ACC TGT CAG ACA CCG GAG 429 
CAG GCT CAG CTT CCC CGC TGG GTT GCT GCG CGC GCA AAA 468
CAG CTC AAC TTA GAA CTG GAT GAC GCG GCA AAT CAG GTG 507
CTC TGC TAC TGT TAT GAA GGT AAC CTG CTG GCG CTG GCT 546 
CAG GCA CTG GAG CGT TTA TCG CTG CTC TGG CCA GAC GGC 585 
AAA TTG ACA TTA CCG CGC CTT GAA CAG GCG GTG AAT GAT 624 
grr arc, cat tvt &cc. nrrr ttt CAT TGG GTT GAT GCT ITS 663 
TTC ATG GGA AAA AGT AAG CGC GCA TTG CAT ATT CTT CAG 702
CAA CTG CGT CTG GAA GGC AGC GAA CCG GTT ATT TTG TTG 741
CGC ACA TTA CAA CGT GAA CTG TTG TTA CTG GTT AAC CTG 780 
AAA CGC CAG TCT GCC CAT ACG CCA CTG CGT GCG TTG TTT 819 
GAT AAG CAT CGG GTA TGG CAG AAC CGC CGG GGC ATG ATG 858 
GGC GAG GCG TTA AAT CGC TTA AGT CAG ACG CAG TTA CGT 897 
CAG GCC GTG CAA CTC CTG ACA CGA ACG GAA CTC ACC CTC 936 
AAA CAA GAT TAC GGT CAG TCA GTG TGG GCA GAG CTG GAA 975 
GGG TTA TCT CTT CTG TTG TGC CAT AAA CCC CTG GCG GAC 1014 
GTA TTT ATC GAC GGT TGA 1032 

The underlined portions of this sequence refer to the tryptic peptides
δ-1 (55-117), δ-2 (358-399), and δ-3 (604-672). In addition, the
upstream sequence:

CCGAACAGCT GATTCGTAAG CTGOCA AGCA TCCGTGCTGC GGATATTCGT 50
TCCGACGAAG AACAGACGTC GACCACAACG GATACTCCGG CAACGCCTGC 100
ACGCGTCTCC ACCAQ3CIGG GTAACTG 127

wherein the last underlined TG denotes two-thirds of the rlpB stop codon;
in addition, the putative RNA polymerase promoter signals (TCGCCA and
GATATT) and the Shine-Dalgarno sequence (ACGCt) are underlined.
In addition, the downstream nucleic acid sequence for holA

is as follows:

~~ le^mAAATCT TtACAGGCTC TGTTTGGCGG CACCTTTGAT CCGGTGCACT ~. 
^ ATGGTCATCT AAAACCCGTT GGAAGCGTGG (XGAAGTTTT GATTGGTCTG AC ^ ,U ^ 

The holA gene translates into the amino acid sequence:

Met Ile Arg Leu Tyr Pro Glu Gln Leu Arg Ala Gln Leu Asn Glu
                  5                  10                  15

Gly Leu Arg Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro Leu Leu
                 20                  25                  30

Leu Gln Glu Ser Gln Asp Ala Val Arg Gln Val Ala Ala Ala Gln
                 35                  40                  45

Gly Phe Glu Glu His His Thr Phe Ser Ile Asp Pro Asn Thr Asp
                 50                  55                  60

Trp Asn Ala Ile Phe Ser Leu Cys Gln Ala Met Ser Leu Phe Ala
                 65                  70                  75

Ser Arg Gln Thr Leu Leu Leu Leu Leu Pro Glu Asn Gly Pro Asn
                 80                  85                  90

Ala Ala Ile Asn Glu Gln Leu Leu Thr Leu Thr Gly Leu Leu His
                 95                 100                 105

Asp Asp Leu Leu Leu Ile Val Arg Gly Asn Lys Leu Ser Lys Ala
                110                 115                 120

Gln Glu Asn Ala Ala Trp Phe Thr Ala Leu Ala Asn Arg Ser Val
                125                 130                 135

Gln Val Thr Cys Gln Thr Pro Glu Gln Ala Gln Leu Pro Arg Trp
                140                 145                 150

Val Ala Ala Arg Ala Lys Gln Leu Asn Leu Glu Leu Asp Asp Ala
                155                 160                 165

Ala Asn Gln Val Leu Cys Tyr Cys Tyr Glu Gly Asn Leu Leu Asn
                170                 175                 180

Leu Ala Gln Ala Leu Glu Arg Leu Ser Leu Leu Trp Pro Asp Gly
                185                 190                 195

Lys Leu Thr Leu Pro Arg Val Glu Gln Ala Val Asn Asp Ala Ala
                200                 205                 210

His Phe Thr Pro Phe His Trp Val Asp Ala Leu Leu Met Gly Lys
                215                 220                 225

Ser Lys Arg Ala Leu His Ile Leu Gln Gln Leu Arg Leu Gly Gly
                230                 235                 240

Ser Glu Pro Val Ile Leu Leu Arg Thr Leu Gln Arg Glu Leu Leu
                245                 250                 255

Leu Leu Val Asn Leu Lys Arg Gln Ser Ala His Thr Pro Leu Arg
                260                 265                 270

Ala Leu Phe Asp Lys His Arg Val Trp Gln Asn Arg Arg Gly Met
                275                 280                 285

Met Gly Glu Ala Leu Asn Arg Leu Ser Gln Thr Gln Leu Arg Gln
                290                 295                 300

Ala Val Gln Leu Leu Thr Arg Thr Glu Leu Thr Leu Lys Gln Asp
                305                 310                 315

Tyr Gly Gln Ser Val Trp Ala Glu Leu Glu Gly Leu Ser Leu Leu
                320                 325                 330

Leu Cys His Lys Pro Leu Ala Asp Val Phe Ile Asp Gly
                335                 340         343

The holA gene is located in an area of the chromosome containing
several membrane protein genes. They are all transcribed in the same
direction. The mrdA and mrdB genes encode proteins responsible for the
rod shape of E. coli and the rlpA and rlpB genes encode rare
lipoproteins which are speculated to be important to cell duplication.
The position of the δ gene within a cluster of membrane proteins may be
coincidental or may be related to the putative attachment of the
replisome to the membrane.

As noted, the termination codon of the rlpB protein overlaps one
nucleotide with the initiating ATG of holA leaving a gap of only 2
nucleotides between these genes. holA may be in an operon with rlpB or
there may be a promoter within rlpB. The nearest possible initiation
signals for transcription (the putative RNA polymerase signals) and
translation (Shine-Dalgarno) are underlined in the sequence given above;
the match to their respective consensus sequences is not strong,
suggesting a low utilization efficiency. Inefficient transcription
and/or translation may be expected for a gene encoding a subunit of a
holoenzyme present at only 10-20 copies/cell. The δ gene uses several
rare codons [CCC (Pro), ACA (Thr), GGA (Gly), AGT (Ser), AAT (Asn), TTA,
TTG, CTC (all Leu)] 2-5 times more frequently than average which may
decrease translation efficiency. ATP binding to δ within the holoenzyme
has been detected previously by UV crosslinking. The DNA sequence of
δ shows a near match to the ATP binding site consensus motif (i.e.
AX3GKS for δ at residues 219-225 compared to the published consensus
G/AX4GKS/T, G/AXGKS/T or G/AX2GXGKS/T) [see Nuc. Acids Res 17:8413
(1989)]. Whether δ binds ATP specifically at this site remains to
be determined. Of the 33 arginine and lysine residues in δ, 16 of them
(50%) are within amino acids 225-307. This same region contains only
5 (14%) of the 35 glutamic and aspartic acid residues. Whether this
concentration of basic residues is significant to function is unknown.
There are no strong matches to consensus sequences for motifs encoding
zinc fingers or helix-loop-helix DNA binding domains.
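
The motif comparison described above can be illustrated with a short pattern search. The following is a minimal sketch, not the method used in the original work: it scans residues 211-231 of δ, transcribed into one-letter code from the amino acid sequence printed above, for the three published consensus forms and for the AX3GKS near match. The names `FRAGMENT`, `STRICT`, `NEAR` and `find_motif` are illustrative assumptions, and residue numbers are counted from the printed sequence, so they may be offset by one from the range quoted in the text.

```python
import re

# Residues 211-231 of delta, transcribed from the printed sequence (one-letter code).
FRAGMENT = "HFTPFHWVDALLMGKSKRALH"
FRAGMENT_START = 211  # residue number of the first letter in FRAGMENT

# Published consensus forms quoted in the text:
# G/A-X4-GKS/T, G/A-X-GKS/T and G/A-X2-G-X-GKS/T
STRICT = [r"[GA].{4}GK[ST]", r"[GA].GK[ST]", r"[GA].{2}G.GK[ST]"]

# Near match reported for delta: A-X3-GKS
NEAR = r"A.{3}GKS"


def find_motif(pattern, seq, offset):
    """Return (first_residue, last_residue, matched_text) for every hit."""
    return [(m.start() + offset, m.end() + offset - 1, m.group())
            for m in re.finditer(pattern, seq)]


strict_hits = [hit for p in STRICT for hit in find_motif(p, FRAGMENT, FRAGMENT_START)]
near_hits = find_motif(NEAR, FRAGMENT, FRAGMENT_START)

print("strict consensus hits:", strict_hits)
print("near-match hits:", near_hits)
```

Under these assumptions none of the strict consensus forms matches the fragment, while the relaxed pattern finds the ALLMGKS stretch, which is consistent with the "near match" wording in the text.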

The holA gene was cloned into M13mp18 and an NdeI site was
created at the initiating methionine by the site directed mutagenesis
technique in order to study the overproduction of this gene. The δ gene
was then excised from M13delta and inserted into the NdeI site of the
pET-3c expression vector [see Methods Enzymol 185:60 (1990)] which
places δ under control of a strong T7 RNA polymerase promoter, see Fig
3-1. Upon transformation into BL21(DE3) cells and induction of T7 RNA
polymerase with IPTG, the δ protein was expressed to 27% total cell
protein. For reasons unknown, δ was not produced in BL21(DE3) cells
containing the pLysS plasmid. Induction at 25°C yielded approximately
2-fold more δ and increased the solubility of the overproduced δ
relative to induction at 37°C. Twelve liters of induced cells were
lysed using lysozyme and 141 mg of pure δ was obtained in 28% overall
yield upon column fractionation using Hexylamine Sepharose, Heparin
Agarose, and Q Sepharose. Delta protein tended to precipitate upon
standing in low salt (<70 mM), especially during dialysis. Therefore,
low salt was avoided except for short periods of time and column
fractions containing δ were sometimes diluted in preparation for the
next column rather than dialyzed overnight. The δ subunit was assayed
by its ability to reconstitute efficient replication of a singly primed
M13mp18 ssDNA "coated" with SSB in the presence of α, ε, β, and γ
subunits. Cell lysate prepared from induced cells containing pETdelta
was more active in the replication assay than cell lysate prepared
from induced cells containing the pET-3c vector.

The expressed δ protein comigrated with the authentic δ subunit
contained within the γ complex of the holoenzyme. The N-terminal
sequence analysis of the pure cloned δ was identical to that predicted
from the holA sequence according to the present invention, proving that
the protein encoded by the gene had been purified. Furthermore, the
overproduced δ subunit was active with only the α, ε, γ and β subunits
of the holoenzyme (Fig 5-1). In the presence of a sixth subunit, δ',
activity was enhanced. The amount of the cloned δ required to
reconstitute the efficient DNA synthesis characteristic of the
holoenzyme using the 5 or 6 subunit combination according to the
present invention is in the range shown previously for the naturally
purified δ resolved from the γ complex. As shown below, addition of
more γ to the replication assay brings the amount of δ down even further
to about 1 ng for a stoichiometry of about 1-2 δ monomers per DNA
circle replicated.

Electrospray mass spectrometry of the cloned δ protein yielded a
molecular mass of 38,704 Da. This mass is within 0.0015% of the mass
predicted from the gene, well within the 0.01% error of the mass
spectrometry technique. This is evidence that the DNA sequence above
according to the present invention contains no errors and indicates the
overproduced δ is not modified during or after translation. The ε280
calculated from the amino acid composition of δ is 46,230 M⁻¹cm⁻¹.
The measured absorbance of δ in 6 M guanidine hydrochloride is only 0.2%
higher than in buffer A. Hence, the ε280 of native δ is 46,137 M⁻¹cm⁻¹.

For further understanding of the individual subunits, the present
invention also determines whether δ and δ' are monomeric, dimeric or
higher order structures. The δ and δ' subunits were each analyzed
in a gel filtration column, and they migrated in essentially the same
position as one another (fractions 30-32). As discussed below, δ'
appears as two proteins, δ'L and δ'S, which differ by approximately 0.5
kDa. Comparison with protein standards of known Stokes radius yielded
a Stokes radius of 26.5 Å for δ and 25.8 Å for δ', slightly smaller than the
27.5 Å radius of the 43.5 kDa ovalbumin standard, indicating that δ and δ'
are both monomeric (their gene sequences predict: δ, 38.7 kDa; δ', 36.9
kDa). In a glycerol gradient sedimentation analysis both δ and δ'
migrated in the same position as one another with an S value of 3.0
relative to protein standards, a slightly lower sedimentation value than
the 43.5 kDa ovalbumin standard, again indicating a monomeric state for
δ and δ'. Besides protein mass, the protein shape is also a
determinant of both the Stokes radius and the S value obtained by these
techniques. The shape, however, causes opposite behavior in these two
techniques: a protein with an asymmetric shape behaves in gel
filtration as a larger protein than if it were spherical (elutes early) and
behaves in sedimentation like a smaller protein than if it were
spherical (sediments slower). The Stokes radius and S value can be
combined in the equation of Siegel and Monty whereupon the protein
shape factor cancels. Therefore, the native mass of the protein
obtained from such treatment is more accurate than calculating the
mass from only the S value or the Stokes radius and assuming a
spherical shape. This calculation yielded a native mass of 34.7 kDa for
δ and 33.8 kDa for δ', values similar to the monomer molecular mass
predicted from the gene sequences of δ and δ', further evidence they are
monomers. Their frictional coefficients are each significantly greater
than 1.0, indicating they are not spherical but have some asymmetry to
their shape. One can also conclude from this work that the two δ'
subunits are a mixture of δ'L and δ'S rather than a complex of δ'L and δ'S.
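
The combination of Stokes radius and S value described above can be sketched numerically. The block below uses the usual form of the Siegel and Monty relation, M = 6πηN·a·s / (1 − v̄ρ), together with the frictional ratio f/f0 = a / (3v̄M / 4πN)^(1/3). The solvent viscosity, solvent density and partial specific volume are not given in the text; the values used here (water at 20°C and a typical protein v̄ of 0.74 cm³/g) are assumptions made purely for illustration, as are the function names.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole
ETA = 0.01002        # assumed solvent viscosity (water, 20 C), in poise
RHO = 0.998          # assumed solvent density, g/cm^3
V_BAR = 0.74         # assumed partial specific volume of the protein, cm^3/g


def siegel_monty_mass(stokes_radius_angstrom, s_value_svedberg):
    """Native mass (g/mol) from Stokes radius (angstrom) and S value (Svedberg units)."""
    a_cm = stokes_radius_angstrom * 1e-8
    s_sec = s_value_svedberg * 1e-13
    return 6 * math.pi * ETA * AVOGADRO * a_cm * s_sec / (1 - V_BAR * RHO)


def frictional_ratio(stokes_radius_angstrom, mass_da):
    """Ratio of the Stokes radius to the radius of an anhydrous sphere of equal mass."""
    a_cm = stokes_radius_angstrom * 1e-8
    r_sphere = (3 * V_BAR * mass_da / (4 * math.pi * AVOGADRO)) ** (1 / 3)
    return a_cm / r_sphere


mass_delta = siegel_monty_mass(26.5, 3.0)        # Stokes radius and S value from the text
mass_delta_prime = siegel_monty_mass(25.8, 3.0)
ratio_delta = frictional_ratio(26.5, 38700)      # 38.7 kDa predicted from the gene

print(round(mass_delta), round(mass_delta_prime), round(ratio_delta, 2))
```

With these assumed constants both masses come out in the mid-30 kDa range, close to the reported 34.7 and 33.8 kDa, and the frictional ratio exceeds 1.0. The exact figures depend on the assumed v̄, η and ρ.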

In initial studies using the cloned δ, δ forms only a weak complex
with γ but, together with δ', a stable γδδ' complex can be reconstituted
which remains intact in gel filtration and ion exchange chromatography.
Likewise, δ' forms only a weak complex with γ, and requires the δ
subunit to bind γ tightly. Both δ and δ' appear monomeric and bind to
each other to form a δδ' heterodimer.

Availability of the δ subunit in large quantity will allow detailed
studies of the mechanism of the γ complex in β clamp formation.
Further, identification of the δ gene will provide for genetic analysis
(essentiality) of δ in E. coli replication and possibly other roles of δ in
DNA metabolism.

The second subunit according to the present invention, that of δ',
was also identified from the δδ' fraction in like manner. The N-terminal
sequence, comprising the first 18 amino acids in the peptide, and the
tryptic peptide sequence were obtained. The amino acid sequence
determined from the initial sequence studies for the δ' peptide is:
Met Arg Trp Tyr Pro Trp Leu Arg Pro Asp Phe Glu Lys Leu Val
                  5                  10                  15

Ala Ser Tyr Gln Ala Gly Arg Gly His His Ala Leu Leu Ile Gln
                 20                  25                  30

Ala Leu Pro Gly Met Gly Asp Asp Ala Leu Ile Tyr Ala Leu Ser
                 35                  40                  45

Arg Tyr Leu Leu Cys Gln Gln Pro Gln Gly His Lys Ser Cys Gly
                 50                  55                  60

His Cys Arg Gly Cys Gln Leu Met Gln Ala Gly Thr His Pro Asp
                 65                  70                  75

Tyr Tyr Thr Leu Ala Pro Glu Lys Gly Lys Asn Thr Leu Gly Val
                 80                  85                  90

Asp Ala Val Arg Glu Val Thr Glu Lys Leu Asn Glu His Ala Arg
                 95                 100                 105

Leu Gly Gly Ala Lys Val Val Trp Val Thr Asp Ala Ala Leu Leu
                110                 115                 120

Thr Asp Ala Ala Ala Asn Ala Leu Leu Lys Thr Leu Glu Glu Pro
                125                 130                 135

Pro Ala Glu Thr Trp Phe Phe Leu Ala Thr Arg Glu Pro Glu Arg
                140                 145                 150

Leu Leu Ala Thr Leu Arg Ser Arg Cys Arg Leu His Tyr Leu Ala
                155                 160                 165

Pro Pro Pro Glu Gln Tyr Ala Val Thr Trp Leu Ser Arg Glu Val
                170                 175                 180

Thr Met Ser Gln Asp Ala Leu Leu Ala Ala Leu Arg Leu Ser Ala
                185                 190                 195

Gly Ser Pro Gly Ala Ala Leu Ala Leu Phe Gln Gly Asp Asn Trp
                200                 205                 210

Gln Ala Arg Glu Thr Leu Cys Gln Ala Leu Ala Tyr Ser Val Pro
                215                 220                 225

Ser Gly Asp Trp Tyr Ser Leu Leu Ala Ala Leu Asn His Glu Gln
                230                 235                 240

Ala Pro Ala Arg Leu His Trp Leu Ala Thr Leu Leu Met Asp Ala
                245                 250                 255

Leu Lys Arg His His Gly Ala Ala Gln Val Thr Asn Val Asp Val
                260                 265                 270

Pro Gly Leu Val Ala Glu Leu Ala Asn His Leu Ser Pro Ser Arg
                275                 280                 285

Leu Gln Ala Ile Leu Gly Asp Val Cys His Ile Arg Glu Gln Leu
                290                 295                 300

Met Ser Val Thr Gly Ile Asn Arg Glu Leu Leu Ile Thr Asp Leu
                305                 310                 315

Leu Leu Arg Ile Glu His Tyr Leu Gln Pro Gly Val Val Leu Pro
                320                 325                 330

Val Pro His Leu
            334

From these sequences, two DNA oligonucleotide probes were
made and used (after end-labelling with 32P for use in Southern blot
analysis) to probe a Southern blot of E. coli DNA which was grown,
isolated and restricted as above. The sequences of the two probes
were:

probe 1:

ACT CTG GAA GAA CCG CCG GCT GAA ACT TGG TTT TTT CTG GCT 42
ACT CGT GAA CCG GAA 57; and

probe 2:

GCT GGT TCT CCG GGT GCT GCT CTG GCT CTG TTT CAG GGT GAT 42
GAC TGG CAG GCT 54.

Of the two Southern blots analyzed (one with the 57-mer probe
and the other with the 54-mer probe), the patterns from the blots had
one set of bands in common, and these were sized by comparison with
size standards in the same gel following recognized techniques. The
sizes of these 8 common "bands" or DNA fragments produced by digestion
with 8 restriction enzymes were used to scan, by eye, the restriction
map of the E. coli genome [see Cell 50:495 (1987)]. One unique location
on the genome was located which was compatible with all 8 restriction
fragment sizes.

Phage λ236 was selected as a phage containing the "unique
location" in the E. coli genome. The δ' gene was excised from the λ236
phage using restriction enzymes EcoRV and KpnI to yield a 2.3 kb
fragment of DNA. This fragment was then ligated into pUC18 and
sequenced using a Sequenase kit (US Biochemicals) in accordance with
the manufacturer's instructions. The fragment was also ligated into an
M13mp18 vector for making a site specific mutation, as described
above, at the ATG start codon (i.e., changing the CGCATG to CATATG,
thereby allowing NdeI to cleave the nucleotide at CATATG, whereas it
could not cleave the nucleotide using the normal CGCATG sequence).
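
The effect of the CGCATG to CATATG change can be checked with a simple string search. This is a minimal sketch: the 33-nucleotide window around the holB start codon is assembled from the upstream and coding sequences given in this document, and the names `wild_type`, `mutant` and `ndei_positions` are illustrative only.

```python
NDEI_SITE = "CATATG"  # NdeI recognition sequence

# 33-nucleotide window spanning the holB start codon (upstream bases followed by ATG AGA TGG TAT CCA).
wild_type = "GGTGAAGGAGTTGGACGCATGAGATGGTATCCA"

# Site-specific mutation described in the text: CGCATG -> CATATG.
mutant = wild_type.replace("CGCATG", "CATATG", 1)


def ndei_positions(seq):
    """Return the 0-based start index of every NdeI site in seq."""
    return [i for i in range(len(seq) - len(NDEI_SITE) + 1)
            if seq[i:i + len(NDEI_SITE)] == NDEI_SITE]


# Number of base substitutions introduced by the mutagenesis.
changed = sum(1 for a, b in zip(wild_type, mutant) if a != b)

print(ndei_positions(wild_type), ndei_positions(mutant), changed)
```

The wild-type window contains no NdeI site, while the mutant contains exactly one. Two base substitutions are needed, and the ATG start codon itself is left unchanged.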

The nucleic acid sequence obtained from these studies predicted
the amino acid sequence determined for the δ' peptide in frame, and thus the
selected sequence was that for the δ' gene. The nucleic acid sequence,
according to the present invention, for this second subunit, δ', is:
ATG AGA TGG TAT CCA TGG TTA CGA CCT GAT TTC GAA AAA 39
CTG GTA GCC AGC TAT CAG GCC GGA AGA GGT CAC CAT GCG 78 
CTA CTC ATT CAG GCG TTA CCG GGC ATG GGC GAT GAT GCT 117 
TTA ATC TAC GCC CTG AGC OGT TAT TTA CTC TGC CAA CAA 156 
30 CCG CAG GGC CAC AAA AGT TGC GGT CAC TGT CGT GGA TGT 195 
CAG TTG ATG CAG GCT GGC ACG CAT GCC GAT TAC TAC ACC 234 
CTG GCT CCC GAA AAA GGA AAA AAT ACG CTG GGC GTT GAT 273 
GCG GTA CGT <* fi ^T- ACT. GAA AAG CTG AAT GAG CAC GCA 312 
car tta GGT r ^r cm AAA CTT f?TT TOG GTA ACC GAT GCT 351 





err, tta cta acc gac. one rra rrr aac gta TTG CTG AftA 390 
Am nvr gaa c,ag cta oca rra gaa act TOG TTT TTC CTG 429 
nrr Ann rac pap, cot gaa a ?r tta ctg cta aca m CGT 468 

ACT CGT TGT CGG TTA CAT TAC CTT GCG CCG CCG COG GAA 507 

pap; tac gcc gtg act, tgg ctt TTA CGC GAA GTG ACA ATG 546 

TCA CAG GAT GCA TTA, CTT GCC GCA TTG CGC TTA AGC GCC 585 

raTT TCR cot ran gcg oca e rr, am ttg ttt CAG GGA GAT 624 

pAC. TOG CAG GCT CGT GAA ACA TTG TGT CAG GCG TTG GCA 663 
TAT AGC GTG CCA TCG GGC GAT TGG TAT TCG CTG CTA GCG 702 
GCC CTT AAT CAT GAA CAA GTC CCG GCG CGT TTA CAC TGG 741 
CTG GCA ACG TTG CTG ATG GAT GCG CTA AAA CGC CAT CAT 780 
GGT GCT GCG CAG GTG ACC AAT GTT GAT GTG CCG GGC CTG 819 
GTC GCC GAA CTG GCA AAC CAT CTT TCT CCC TCG CGC CTG 858 
CAG GCT ATA CTG GGG GAT GTT TGC CAC ATT CGT GAA CAG 897 
TTA ATG TCT GTT ACA GGC ATC AAC CGC GAG CTT CTC ATC 936 
ACC GAT CTT TTA CTG CGT ATT GAG CAT TAC CTG CAA CCG 975 
GGC GTT GTG CTA CCG GTT CCT CAT CTT 1002 

The underlined portions of this sequence refer to the tryptic peptides
δ'-1 (283-315), δ'-2 (316-327), δ'-3 (328-390), δ'-4 (391-462), δ'-5
(481-534), and δ'-6 (577-639). In addition, the upstream sequence:

"l^AGAATCTTT CGATTTCTTT AATCGCACCC GCGCCCGCTA TCTGGAACTG 50 
GCAGCACAAG ATAAAAGCAT TCATACCATT GATGCCACCC AGCCGCTGGA 100 
GGCCGTGATG GATGCAATCC GCACTACCGT GACCCACTGG GTGAAGGAGT 150 
TGGACGC 157

contains an underlined putative translational signal (Shine-Dalgarno).
In addition, the downstream nucleic acid sequence for δ' begins
with a stop codon:

TTA GAGAGACATC ATGTTTTTAG TGGACTCACA CTGCCATCTC 43 
GATGGTCTGG ATTATGAATC TTTQCATAAG GACGTGGATG ACGTTCTGGC 93 
GAAAGCCGCC GCACGCGATG TGAMTTTTG TCTGGCAGTC GCCACAACAT 143 

The δ' gene (holB) was then subcloned into M13mp18, and an NdeI
site was created at the initiating codon as described above. The δ' gene
was then excised from M13 using NdeI restriction enzyme and a second
enzyme which cut downstream of δ', and the excised gene was subcloned
into the pET-3c overexpression plasmid using the same techniques
described above. Following overexpression of the δ' protein, the protein
was purified using a Fast Flow Q - Heparin - Hexylamine technique as
described herein. Ninety mg of δ' protein was obtained from 4 liters of
cells.

Further studies on the δ' gene were conducted to make certain
that the gene sequence obtained from this research was actually the δ'
gene and not some artifact. These studies showed that the gene
sequence according to the present invention predicted all the peptide
sequence information, that the product of the cloned δ' gene comigrates
with the naturally occurring δ' on a 13% SDS-PAGE gel, that the product
of the cloned δ' gene stimulates the 5 protein system as does the
naturally occurring δ', and that the cloned δ' forms a δ'δ complex with δ
in a similar manner to that which occurs with the naturally occurring δ'
and δ.
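
The statement that the gene sequence predicts the peptide sequence information can be illustrated by translating codons with the standard genetic code. The sketch below is an illustration only, not the procedure used in the original work. It translates the first and last lines of the holB coding sequence as printed above; the names `CODON_TABLE`, `translate`, `first_codons` and `last_codons` are assumptions introduced for the example.

```python
# Standard genetic code, built from the conventional TCAG ordering of codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string (spaces ignored) into one-letter amino acid code."""
    seq = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# First 13 codons and last 9 codons of holB, copied from the sequence above.
first_codons = "ATG AGA TGG TAT CCA TGG TTA CGA CCT GAT TTC GAA AAA"
last_codons = "GGC GTT GTG CTA CCG GTT CCT CAT CTT"

n_terminus = translate(first_codons)
c_terminus = translate(last_codons)
print(n_terminus, c_terminus)
```

The two translations correspond to Met Arg Trp Tyr Pro Trp Leu Arg Pro Asp Phe Glu Lys at the N-terminus and Gly Val Val Leu Pro Val Pro His Leu at the C-terminus of the δ' amino acid sequence listed earlier.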

With specific regard to the isolation and characterization of δ'
and holB according to the present invention, the amino acid sequencing
was conducted using δ and δ' subunits purified to apparent homogeneity
through the ATP-agarose column step [see J. Biol. Chem. 265:1179
(1990)] from 1.3 kg of the γ/τ overproducing strain of E. coli:
HB101(pNT203, pSK100) [see J. Biol. Chem. 263:6555 (1988)]. The δ and
δ' subunits were separated on a 13% SDS polyacrylamide gel whereupon
the δ' resolved into two bands. The slower and faster migrating δ' bands
are referred to as δ'L (large) and δ'S (small), respectively; δ'S was
approximately 2 times the abundance of δ'L. Both δ'L and δ'S were
electroblotted onto PVDF membrane (Whatman) [see J. Biol. Chem.
262:10035 (1987)] for N-terminal sequencing (50 pmol each of δ'L and
δ'S), and onto nitrocellulose membrane (Schleicher and Schuell) [see
Proc. Natl. Acad. Sci. USA 84:6970 (1987)] for tryptic analysis (90 pmol
of δ'L and 180 pmol of δ'S). Proteins were visualized by Ponceau S stain
(Sigma).

Analysis of the more abundant δ'S was as follows: the N-terminal
sequence was:

NH2-Met Arg Trp Tyr Pro Pro Leu (Arg)(Pro) Asp Phe Glu Lys Leu Val Ala
                      5                     10                  15

and the tryptic peptides were:

δ'-1:
NH2-Glu Val Thr Glu Lys Leu Asn Glu His Ala Arg;
                      5                  10

δ'-3:
NH2-Val Val Trp Val Thr Asp Ala Ala Leu Leu Thr Asp
                      5                  10
    Ala Ala Ala Asn Ala Leu Leu Lys;
             15                  20

δ'-4:
NH2-Thr Leu Glu Glu Pro Pro Ala Glu Thr Trp Phe Phe Leu Ala
                      5                  10
    Thr Arg Glu Pro (Glu) (Arg) Leu Leu Ala Thr (Leu);
     15                     20

δ'-5:
NH2-Leu His Tyr Leu Ala Pro Pro (Pro) Glu Gln Tyr Ala Val
                      5                    10
    Thr (Trp) Leu Ser Arg; and
          15

δ'-6:
NH2-Leu Ser Ala Gly Ser Pro Gly Ala Ala Leu Ala Leu Phe Gln
                      5                  10
    Gly Asp Asn Trp Gln Ala Arg.
     15                  20

Sequence analysis of tryptic peptides of the less abundant δ'L
yielded:

δ'-2:
NH2-Leu Gly Gly Ala Lys; and
                      5

δ'-7 (same as δ'-3):
NH2-Val Val Trp Val Thr Asp Ala Ala Leu Leu Thr Asp
                      5                  10
    Ala Ala Ala Asn Ala Leu Leu Lys.
             15                  20

Parentheses in the above sequences indicate uncertain
assignments.

Two synthetic oligonucleotide probes (DNA oligonucleotides,
Oligos etc. Inc.) were designed from the sequence of two of the tryptic
peptides and the codon usage of E. coli with allowance for a T-G
mispair at the wobble position. A synthetic DNA 57-mer probe was
based on the sequence of δ'-4 (amino acids 131-149):

*~~Ala Cys Thr Cys Thr Gly Gly Ala Ala Gly Ala Ala Cys Cys Gly 

5 10 15 

5 Cys Cys Gly Gly Cys Thr Thr Gly Ala Ala Ala Cys Thr Thr Gly 

10 25 30 

Gly Thr Thr Thr Thr Thr Thr Cys Thr Gly Gly Cys Thr Ala Cys 
35, 40 45 

Thr Cys Gly Thr Gly -Ala Ala Cys Cys Gly Gly Ala Ala 
10 50 55 

(after identification and sequencing of holB this probe was found to be
incorrect at 11 positions). A DNA 54-mer probe was based on the sequence
of δ'-6 (amino acids 195-212):
* Gly Cys Thr Gly Gly Thr Thr Cys Thr Cys Cys Gly Gly Gly Thr 
15 5 10 15 

Gly Cys Thr Gly Cys Thr Cys Thr Gly Gly Cys Thr Cys Thr Gly 
20 ' 25 30 

Thr Thr Thr Cys Ala Gly Gly Gly Thr Gly Ala Thr Ala Ala Cys 
35 40 45 

Thr Gly Gly Cys Ala Gly Gly Cys Thr
                 50

(after identification and sequencing of holB the probe was found to be
incorrect at 9 positions). These probes (100 pmol each) were 5'
end-labelled with 1 µM [γ-32P]ATP (radionucleotides, DuPont-New England
Nuclear) and polynucleotide kinase. E. coli genomic DNA (strain C600) was
extracted [see J. Mol. Biol. 3:208 (1961)] and restricted with either
BamHI, HindIII, EcoRI, EcoRV, BglI, KpnI, PstI or PvuII (DNA modification
enzymes, New England Biolabs) and then each digest was
electrophoresed in a 0.8% native agarose gel followed by depurination
(0.25 M HCl), denaturation (0.5 M NaCl) and then neutralization (1 M Tris, 2
M NaCl, pH %.0) prior to transfer to Gene Screen Plus (DuPont-New
England Nuclear) for Southern analysis using a Vacugene apparatus
(Pharmacia) in the presence of 2X SSC (0.3 M NaCl, 0.3 M sodium acetate,
pH 7.0). Conditions for hybridization and washing using these
oligonucleotide probes were determined empirically and the desired
results were obtained using a hybridization temperature of 42°C then
washing with 2X SSC and 0.2% SDS at successively higher temperature
until evaluation by autoradiography showed a single band in each lane
for the 57-mer, and two bands in each lane for the 54-mer (this
occurred at 53°C for both probes). Although the 54-mer showed two
bands in each lane, one band always matched the position of the band
probed with the 57-mer.

The 2.1 kb KpnI/EcoRV fragment containing holB was excised
from λ E9G1(236) [see Cell 50:495 (1987)] and directionally ligated into
pUC18 (KpnI/HincII) to yield pUC-δ'. Both strands of DNA were
sequenced by the chain termination method of Sanger using the United
States Biochemicals Sequenase kit, [α-35S]dATP, and synthetic DNA
18-mers.

A 2.1 kb KpnI/HindIII fragment containing the holB gene was
excised from pUC-δ' and directionally ligated into M13mp18 to yield
M13-δ'. An NdeI site was generated at the start codon of holB by
oligonucleotide site directed mutagenesis [see Methods Enzymol
154:367 (1987)] using a DNA 33-mer:

Gly Gly Thr Gly Ala Ala Gly Gly Ala Gly Thr Thr Gly Gly Ala
                  5                  10                  15

Cys Ala Thr Ala Thr Gly Ala Gly Ala Thr Gly Gly Thr Ala Thr
                 20                  25                  30

Cys Cys Ala

containing the NdeI site (underlined) at the start codon of holB to prime
replication of M13-δ' viral ssDNA and using SSB and DNA polymerase III
holoenzyme (in place of DNA polymerase I) to replicate the circular
template without strand displacement. The M13 chimera is called
M13-δ'-NdeI. An NdeI fragment (1160 bp) containing the holB gene was
excised from M13-δ'-NdeI and ligated into pET3c, linearized using NdeI,
to yield pET-δ'. The orientation of the holB gene in pET-δ' was
determined by sequencing.

Reconstitution assays contained 108 ng M13mp18 ssDNA (0.05
pmol as circles) uniquely primed with a DNA 30-mer [see J. Biol. Chem
266:11328 (1991)], 1.5 µg SSB (21 pmol as tetramer), 30 ng β (0.39 pmol
as dimer), 22.5 ng αε complex (0.14 pmol), 20 ng γ (0.12 pmol as dimer),
2 ng δ (0.05 pmol as monomer) and the indicated amount of δ' (or 1-5 ng
of column fraction during purification) in 20 mM Tris-HCl (pH 7.5), 8
mM MgCl2, 5 mM DTT, 4% glycerol, 40 µg/ml BSA, 0.5 mM ATP, 60 µM
dGTP, and 0.1 mM EDTA in a final volume of 25 µl (after the addition of
the remaining proteins). Assays of γ or τ activity with either δ, δ' or δδ'
contained either 2 ng δ (0.05 pmol as monomer), 2 ng δ' (0.05 pmol as
monomer), or 1 ng (0.025 pmol) each of δ and δ', and the indicated
amount of γ or τ. All proteins were added to the assay on ice and then
shifted to 37°C for 8 minutes to allow reconstitution of the processive
polymerase on the primed ssDNA. DNA synthesis was initiated upon
rapid addition of 60 µM dATP and 20 µM [α-32P]TTP, then quenched after
20 seconds and quantitated using DE81 paper. When needed, proteins
were diluted in 20 mM Tris-HCl (pH 7.5), 2 mM DTT, 0.5 mM EDTA, 20%
glycerol, and 50 µg/ml BSA. Proteins used in the reconstitution assays
were purified as described [see J. Biol. Chem 266:9833 (1991)]. The
concentrations of β and δ were determined by absorbance using ε280
values of 17,900 M⁻¹cm⁻¹ and 46,137 M⁻¹cm⁻¹, respectively.
Concentrations of α, ε, γ, τ and SSB were then determined [see Anal.
Biochem 72:248 (1976)] using BSA as a standard. The concentration of δ'
was determined by absorbance using an ε280 value of 60,136 M⁻¹cm⁻¹.
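
The nanogram and picomole figures quoted for the assay components are related through the molar mass of each protein. The following is a minimal sketch of that conversion, using the monomer masses predicted from the gene sequences in this document (δ, 38.7 kDa; δ', 36.9 kDa). The function name `ng_to_pmol` and its `subunits` parameter are illustrative assumptions.

```python
def ng_to_pmol(mass_ng, molar_mass_da, subunits=1):
    """Convert a protein mass in ng to pmol of an oligomer with the given number of subunits."""
    grams = mass_ng * 1e-9
    moles = grams / (molar_mass_da * subunits)
    return moles * 1e12


DELTA_DA = 38700        # delta monomer mass predicted from holA
DELTA_PRIME_DA = 36900  # delta' monomer mass predicted from holB

delta_2ng = ng_to_pmol(2, DELTA_DA)                 # reconstitution assay amount
delta_prime_2ng = ng_to_pmol(2, DELTA_PRIME_DA)     # reconstitution assay amount
delta_prime_296ng = ng_to_pmol(296, DELTA_PRIME_DA) # ATPase assay amount

print(round(delta_2ng, 3), round(delta_prime_2ng, 3), round(delta_prime_296ng, 2))
```

For 2 ng of either monomer the result rounds to 0.05 pmol, and 296 ng of δ' corresponds to about 8 pmol, in line with the amounts stated for the assays.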

ATPase assays were performed in a final volume of 20 µl
containing 20 mM Tris-HCl (pH 7.5), 8 mM MgCl2 and contained 285 ng
M13mp18 ssDNA. ATPase assays of γ, δ, δ', δδ', γδ and γδ' with and without
β contained 100 µM [γ-32P]ATP and, when present, 376 ng γ (4 pmol as
dimer), 304 ng δ (7 pmol as monomer), 296 ng δ' (8.0 pmol as monomer),
and 320 ng β (4.2 pmol as dimer). Proteins were added on ice, shifted to
37°C for 30 minutes, then 0.5 µl was spotted on a plastic backed thin
layer chromatography (TLC) sheet coated with Cel-300
polyethyleneimine (Brinkman Instruments Co.). To assay the more active
ATPase activity of γδδ' and τ, 300 µM ATP was used, with less total
protein and less time at 37°C, in order to assess the initial rate of
reaction. Therefore, ATPase assays of γδδ', τ, τδ, τδ' and τδδ' with and
without β contained 300 µM [γ-32P]ATP and, when present, 47 ng γ (0.5
pmol as dimer), 71 ng τ (0.5 pmol as dimer), 38 ng δ (1 pmol as monomer),
37 ng δ' (1 pmol as monomer) and 40 ng β (0.5 pmol as dimer). Proteins
were added on ice, shifted to 37°C for 10 minutes, then analyzed by TLC
as described above.

TLC sheets were developed in 0.5 M lithium chloride, 1 M formic
acid. An autoradiogram of the TLC chromatogram was used to visualize
the free phosphate at the solvent front and ATP at the origin which
were then cut from the TLC sheet and quantitated by liquid
scintillation. The amount of ATP hydrolyzed was calculated as the
percent of total radioactivity located at the solvent front (Pi) times
the total moles of ATP added to the reaction.
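
The calculation of ATP hydrolyzed can be written out as a short function. This is a sketch under stated assumptions: the scintillation counts in the example are invented for illustration and are not data from this work, and the function name and parameters are hypothetical. The ATP concentration (100 µM) and reaction volume (20 µl) are those given for the first set of ATPase assays.

```python
def atp_hydrolyzed_pmol(counts_front, counts_origin, atp_conc_um, volume_ul):
    """ATP hydrolyzed (pmol) = fraction of radioactivity at the solvent front x total ATP added.

    counts_front  : radioactivity of free phosphate (Pi) at the solvent front
    counts_origin : radioactivity of unhydrolyzed ATP at the origin
    atp_conc_um   : ATP concentration in the reaction, micromolar
    volume_ul     : reaction volume, microliters
    """
    fraction_hydrolyzed = counts_front / (counts_front + counts_origin)
    total_atp_pmol = atp_conc_um * volume_ul  # micromolar x microliter = picomole
    return fraction_hydrolyzed * total_atp_pmol


# Hypothetical counts: 2,500 at the solvent front and 7,500 at the origin.
example = atp_hydrolyzed_pmol(2500, 7500, atp_conc_um=100, volume_ul=20)
print(example)
```

In this invented example one quarter of the label is at the solvent front, so one quarter of the 2,000 pmol of ATP in the reaction is counted as hydrolyzed.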

The results of the δ' studies appear below:

The naturally purified δ' (resolved from the γ complex) appears in
a 13% SDS polyacrylamide gel as two bands of approximately 37 kDa
that differ in size by about 1 kDa. The larger protein (δ'L) is
approximately one half the abundance of the smaller one (δ'S). Both δ'L
and δ'S are believed encoded by the same gene as there was no
noticeable difference in their HPLC profiles upon digestion with
trypsin. In support of this, peptides from δ'S and δ'L that had the same
retention time on HPLC analysis also had identical amino acid
sequences (peptide δ'-3 from δ'S and δ'-7 from δ'L were identical). The
N-terminus of δ'S and five tryptic peptides of δ'S and two tryptic
peptides of δ'L were sequenced.

A search of GenBank revealed no match to the N-terminal
sequence or to any of the tryptic peptides from either δ'L or δ'S. Two
best-guess oligonucleotide probes (a 57-mer and a 54-mer) were
designed from tryptic peptides δ'-4 and δ'-6 based on the codon usage
frequency in E. coli [see PNAS USA 80:687 (1983)]. The oligonucleotide
probes were used in a Southern analysis of E. coli genomic DNA
digested with each of the eight Kohara restriction map enzymes.
Imposing the restraint that the eight restriction fragments from the
Southern analysis must overlap the holB gene, the Kohara map of the E.
coli chromosome was searched and only one position of overlap at 24.3
minutes (1,174 kb on the E. coli chromosome starting from thrA) was
found which satisfied the fragment sizes. The fragment sizes in the
Kohara map and from the Southern analysis are given in the following
table, which depicts the correspondence of the observed size of genomic
DNA restriction fragments with the Kohara restriction map of the E.
coli chromosome in the region of 24 minutes. E. coli genomic DNA was
digested with the restriction enzymes indicated. The sizes of the
restriction fragments that were in common for both the 57-mer and
54-mer probes in the Southern analysis and also the corresponding
sizes of the restriction fragments on the Kohara restriction map of the
E. coli chromosome at 24.5 minutes are listed below.
Restriction     Size of restriction fragment (kb)
enzyme          Southern        Kohara map

PstI            1.7             1.9
BglI            4.25            4.2
KpnI            6.6             6.4
EcoRV           7.0             6.8
PvuII           6.2             6.2
EcoRI           >15             16.2
HindIII         >20             30
BamHI           >25             38
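
The comparison summarized in the table can be expressed as a simple consistency check. The sketch below is illustrative only: the original comparison was made by eye, and the 15% relative tolerance used here is an arbitrary assumption, not a criterion stated in the text. Sizes reported as lower bounds (for example >15 kb) are treated as satisfied when the map fragment is larger than the bound.

```python
# (observed size in Southern analysis, Kohara map size), both in kb, from the table above.
FRAGMENTS = {
    "PstI": ("1.7", 1.9),
    "BglI": ("4.25", 4.2),
    "KpnI": ("6.6", 6.4),
    "EcoRV": ("7.0", 6.8),
    "PvuII": ("6.2", 6.2),
    "EcoRI": (">15", 16.2),
    "HindIII": (">20", 30.0),
    "BamHI": (">25", 38.0),
}


def is_consistent(observed, map_size, tolerance=0.15):
    """True if the observed size agrees with the map size.

    Lower bounds such as '>15' only require the map fragment to exceed the bound.
    The default 15% relative tolerance is an assumed value for illustration.
    """
    if observed.startswith(">"):
        return map_size > float(observed[1:])
    return abs(float(observed) - map_size) / map_size <= tolerance


results = {enzyme: is_consistent(obs, kohara) for enzyme, (obs, kohara) in FRAGMENTS.items()}
print(results)
```

Under this assumed tolerance all eight enzymes give fragment sizes compatible with the single map position, which mirrors the conclusion drawn in the text.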



The Kohara λ phage E9G1(236) contains 16.2 kb of DNA
surrounding the putative holB gene. A 2.1 kb KpnI/EcoRV fragment
containing holB was excised from λ E9G1(236), cloned into pUC18 and
sequenced. The sequence of the KpnI/EcoRV fragment revealed an open
reading frame of 1002 nucleotides which predicts a 334 amino acid
protein of 36.9 kDa (predicted pI of 7.04), consistent with the mobility
of δ' in an SDS polyacrylamide gel. The open reading frame encodes the
N-terminal sequence and all six tryptic peptide sequences obtained
from δ'L and δ'S.

Analysis of the DNA sequence upstream of the open reading frame
revealed a putative translation initiation signal (Shine-Dalgarno
sequence) 8 nucleotides upstream of the ATG initiating codon. No
obvious transcription initiation signals were detected upstream of the
initiation codon, leaving open the possibility that holB is in an operon
with an upstream gene(s). Alternatively, the transcription initiation
signals may poorly match the consensus signals and thereby be
unrecognizable, as a low level of transcription would not be unexpected
for a gene encoding a subunit of the holoenzyme present at only 10-20
copies/cell. The holB gene uses several rare codons [TTA (Leu), ACA
(Thr), GGA (Gly), AGC, TCG (Ser)] 2-4 times more frequently than
average, which may decrease translation efficiency.

The holB sequence contains a helix-turn-helix consensus motif
(Ala/GlyX3GlyX5Ile/Val) at Ala80Gly84Val90, although the ability of δ' to
bind DNA has yet to be examined. There is also a possible leucine zipper
(Leu7X6Leu14X6Gly21X6Leu28) in the N-terminus, although Gly
interrupts the Leu pattern. The holB sequence does not contain
consensus sequences for motifs encoding an ATP-binding site or a zinc
finger. The molar extinction coefficient of δ' calculated from its 8 Trp
and 11 Tyr residues is 59,600 M⁻¹ cm⁻¹, which is only 0.9% lower than
that observed in the presence of 6 M guanidine hydrochloride for a native
extinction coefficient of 60,136 M⁻¹ cm⁻¹.
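
The calculated extinction coefficient can be reproduced with a short sketch. The per-residue values used here (5690 M⁻¹ cm⁻¹ for Trp and 1280 M⁻¹ cm⁻¹ for Tyr at 280 nm in 6 M guanidine hydrochloride) are the commonly cited literature values and are an assumption; the text does not state which values were used. Any contribution from cystine is ignored.

```python
# Assumed per-residue molar extinction coefficients at 280 nm (M^-1 cm^-1)
# for a denatured protein; these constants are not stated in the text.
TRP_E280 = 5690
TYR_E280 = 1280


def denatured_extinction(n_trp: int, n_tyr: int) -> int:
    """Sum the Trp and Tyr contributions to the 280 nm extinction coefficient."""
    return n_trp * TRP_E280 + n_tyr * TYR_E280


# delta-prime has 8 Trp and 11 Tyr residues according to the text.
calculated = denatured_extinction(n_trp=8, n_tyr=11)

# Native extinction coefficient quoted in the text.
native = 60136

# Percent by which the calculated value falls below the native value.
percent_lower = 100 * (native - calculated) / native

if __name__ == "__main__":
    print(calculated, round(percent_lower, 1))
```

Under these assumed constants the sum comes to 59,600 M⁻¹ cm⁻¹, about 0.9% below the native value of 60,136 M⁻¹ cm⁻¹, in line with the figures quoted above.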

To obtain the δ' subunit in large quantity, an expression plasmid
was constructed. The holB gene was first cloned into M13mp18
followed by site directed mutagenesis to create an NdeI site at the
initiating methionine to allow precise subcloning of holB into the pET3c
expression vector. The holB gene was excised from the M13-δ'-NdeI
mutant using NdeI followed by insertion into the NdeI site of the pET3c
expression vector [see Methods Enzymol. 185:60 (1990)], which places
holB under the control of the T7 RNA polymerase promoter of T7 gene
10 and the efficient Shine-Dalgarno sequence of gene 10. The pET-δ'
construct was transformed into BL21(DE3)pLysS cells which harbor a λ
lysogen containing the T7 RNA polymerase gene controlled by the lac
UV5 promoter. Upon induction of T7 RNA polymerase with IPTG, the δ'
protein was expressed to 50% of total cell protein. Cell lysate prepared
from the induced cells containing pET-δ' was 5600-fold more active in
the replication assays than cell lysate prepared from induced cells
containing the pET3c vector as described below.

Three hundred liters of BL21(DE3)pLysS cells harboring pET-δ'
were grown at 37°C in LB media supplemented with 5 mg/ml glucose,
10 µg/ml thiamine, 50 µg/ml thymine containing 100 µg/ml ampicillin
and 25 µg/ml chloramphenicol. Upon growth to an OD600 of 0.6, IPTG
was added to 0.2 mM. After further growth for two hours the cells (940
g) were collected by centrifugation, resuspended in an equal weight of
50 mM Tris-HCl (pH 7.5), 10% sucrose (Tris-Sucrose) and stored at
-70°C. 100 g of cells (30 liters of cell culture) were thawed,
whereupon they lysed (due to lysozyme produced by pLysS), and to this
was added 250 ml Tris-Sucrose, DTT to 2 mM and 40 ml of 10x heat
lysis buffer (50 mM Tris-HCl (pH 7.5), 10% sucrose, 0.3 M spermidine, 1 M
NaCl). The cell debris was removed by centrifugation to yield the cell
lysate (Fraction I, 4.41 g in 325 ml). The purification steps that
followed were performed at 4°C. The reconstitution activity assay for
δ' is as described previously. Ammonium sulphate (0.21 g/ml) was
dissolved in the clarified cell lysate and stirred for 90 minutes. The
precipitated protein containing δ' was pelleted (Fraction II, 1.58 g) and
redissolved in 660 ml of 30 mM Hepes-NaOH (pH 7.2), 10% glycerol, 0.5
mM EDTA, 2 mM DTT (buffer A) and dialyzed against two successive
changes of 2 liters each of buffer A to a conductivity equal to 40 mM
NaCl. Fraction II was loaded onto a 300 ml heparin agarose column
(BioRad) equilibrated with buffer A. The heparin column was washed
with 450 ml buffer A plus 20 mM NaCl, then eluted over a period of 14
hours using a 2.5 liter linear gradient of 20 mM NaCl to 300 mM NaCl in
buffer A. One hundred fractions were collected. Fractions 36-53 were
pooled (Fraction III, 550 ml, 990 mg) and dialyzed twice against 2
liters of 20 mM Tris-HCl (pH 7.5), 10% glycerol, 0.5 mM EDTA, 2 mM DTT
(buffer B) to a conductivity equal to 60 mM NaCl. Fraction III was
loaded onto a 100 ml Q Sepharose column (Pharmacia) equilibrated with
buffer B. The loaded Q Sepharose column was washed with 150 ml of
buffer B plus 20 mM NaCl, then eluted over a period of 12 hours using a
1.2 liter linear gradient of 20 mM NaCl to 300 mM NaCl in buffer B.
Eighty fractions were collected. Fractions 34-56 were pooled (Fraction
IV, 781 mg in 370 ml) and dialyzed twice against 2 liters each of buffer
B to a conductivity equal to 60 mM NaCl just prior to loading onto a 60
ml EAH Sepharose column (Pharmacia) that was equilibrated with buffer
B. The loaded EAH Sepharose column was washed with 60 ml of buffer B
plus 40 mM NaCl, then eluted over a period of 10 hours using a 720 ml
linear gradient of 40 mM NaCl to 500 mM NaCl in buffer B. Eighty
fractions were collected. Fractions 18-30 (Fraction V, 732 mg in 130
ml), which contained homogeneous δ', were pooled and dialyzed against 2
L buffer B (lacking DTT to allow an absorbance measurement, see
below) to a conductivity of 40 mM NaCl. Fraction V was passed over a 5
ml ATP-agarose column (Pharmacia, Type II, N-6 linked) to remove any γ
complex contaminant, followed by addition of DTT to 2 mM, and then was
aliquoted and stored at -70°C. Protein concentration was determined
using BSA as a standard except at the last step, in which concentration
was determined by absorbance using ε280 = 60,136 M⁻¹ cm⁻¹.

Step                    total     total       specific     fold          %
                        protein   units(1)    activity     purification  yield
                        (mg)                  (units/mg)

I   Lysate(2,3)         4414      3.0x10^10   7x10^6       1.0           100
II  Ammonium Sulfate    1584      2.5x10^10   16x10^6      2.3           83
III Heparin             990       2.6x10^10   26x10^6      3.7           87
IV  Q Sepharose         781       2.6x10^10   33x10^6      4.7           87
V   EAH-Sepharose(4)    732       2.5x10^10   34x10^6      4.9           83

(1) One unit is defined as pmol nucleotide incorporated in 20 seconds.

(2) Lysate of BL21(DE3)pLysS cells harboring the pET3c vector yielded a
specific activity of 1252 units/mg.

(3) Omission of γ and δ from the assay of the lysate resulted in a
7650-fold reduction of specific activity (915 units/mg).

(4) Using pure δ', omission of γ from the assay gave no detectable
synthesis under the conditions of the assay.
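
The derived columns of the purification table follow from total protein and total units. A minimal sketch is given below; the names `STEPS` and `purification_table` are introduced here for illustration. Because this sketch works from unrounded specific activities, its fold-purification values differ slightly from the tabulated ones, which appear to have been computed from the rounded specific activities.

```python
# (step name, total protein in mg, total units) from the purification table.
STEPS = [
    ("I Lysate", 4414, 3.0e10),
    ("II Ammonium Sulfate", 1584, 2.5e10),
    ("III Heparin", 990, 2.6e10),
    ("IV Q Sepharose", 781, 2.6e10),
    ("V EAH-Sepharose", 732, 2.5e10),
]


def purification_table(steps):
    """Compute specific activity, fold purification and percent yield per step."""
    _, first_mg, first_units = steps[0]
    base_specific_activity = first_units / first_mg
    rows = []
    for name, protein_mg, units in steps:
        specific_activity = units / protein_mg  # units per mg
        rows.append(
            {
                "step": name,
                "specific_activity": specific_activity,
                "fold": specific_activity / base_specific_activity,
                "yield_pct": 100 * units / first_units,
            }
        )
    return rows


if __name__ == "__main__":
    for row in purification_table(STEPS):
        print(
            f"{row['step']:<22}"
            f"{row['specific_activity'] / 1e6:6.1f}e6 units/mg  "
            f"{row['fold']:4.1f}-fold  {row['yield_pct']:5.1f}%"
        )
```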

The purified overproduced δ' stimulated γδ 30-fold in its action in
reconstituting the processive holoenzyme from the αε polymerase and
the β clamp accessory protein. In this assay the δ' is titrated into a
reaction containing a low concentration of γ and δ, together with the
β subunit, αε polymerase and M13mp18 ssDNA primed with a synthetic
oligonucleotide and coated with SSB. The proteins were preincubated
with the DNA for 8 minutes to allow time for the accessory proteins to
form the preinitiation complex which contains the β clamp and for αε to
bind the preinitiation complex. DNA synthesis is initiated upon addition
of deoxyribonucleoside triphosphates and the reaction is stopped after
20 seconds, which is sufficient time for the processive reconstituted
polymerase to complete the circular DNA. Although a processive
polymerase can be reconstituted without the δ' subunit, under the
conditions used in the present invention in which γ and δ are at low
concentration, the δ' subunit stimulates the reaction greatly (30-fold).
The δ' subunit saturated this assay at a level of approximately one
molecule of δ' to one molecule of δ.
Both the τ and γ subunits of the holoenzyme are encoded by the
same gene (dnaX). The γ subunit is formed as a result of a -1 frameshift
during translation, with the result that γ is only 2/3 the length of τ due
to an earlier translational stop codon (within 2 codons) in the -1
reading frame. The activity of the γ and τ proteins in reconstituting the
processive polymerase was compared using either the δ, δ' or both δδ'
subunits in the presence of αε complex and β subunit (Fig. 6A and 6B). In
the absence of δ and δ', the γ subunit alone displays insignificant
activity in the reconstitution assay, although when a large amount of γ
was present it had very little, but detectable, activity (Fig. 6A). The δ
subunit provides γ with activity in the reconstitution assay, but δ' does
not provide γ with activity. However, the cloned δ' subunit, when
present with δ, markedly stimulated the activity of the γ and δ mixture
such that maximal activity was achieved at much lower concentrations
of added γ.

The τ subunit alone, like γ, was also essentially inactive in the
reconstitution assay, although at very high amounts of τ a slight, but
reproducible, amount of activity was observed. τ is active with δ in this
assay although more τ (50-fold) than γ is needed for comparable
activity. Previously it was observed that τ was unlike γ in that τ was
active with δ' in the reconstitution assay in the absence of any δ
subunit (only τ, δ' and α, ε, β were needed). Consistent with these
previous results, the δ' subunit is active with τ in the absence of δ
(similar to the activity of τ and δ in the absence of δ'). With both δ and
δ' present, only a small amount of τ subunit is required for maximal
activity in the reconstitution assay. The activity of τδδ' parallels that
of γδδ' and requires 500-fold less τ for maximal activity than either τδ
or τδ'. Hence, both the γ subunit and the τ subunit are highly active in
this reconstitution assay when both δ and δ' are present.





The effect of the δ, δ' and β subunits on the DNA dependent ATPase
activity of τ was quite different from their effect on γ, the close
relative of the τ subunit. The τ subunit, by itself, is a much more active
DNA dependent ATPase than γ and, in fact, turns over two times more
ATP than the γδδ' complex. Unlike the γ ATPase, the τ ATPase was
essentially unaffected by β, or by δ with or without β, or by δ' with or
without β. However, like the γ ATPase, the presence of both δ and δ'
stimulated the τ ATPase, although the effect was only 4-fold compared
to the 30-fold stimulation of γ by δδ'. Whereas β stimulated the γδδ'
ATPase 3-fold, the β subunit did not stimulate the τδδ' ATPase at all; in
fact β slightly inhibited it. Yet the τδδ' complex is as active as γδδ' in
reconstituting a processive polymerase with β and αε.

The cloned δ' preparation appears as a doublet in a 13% SDS
polyacrylamide gel and the two polypeptides are of the same size and
molar ratio (2:1, lower band-to-upper band) as the δ' doublet purified
from the γ complex. Electrospray mass spectrometry revealed that the
smaller polypeptide (δ'S) was the size predicted from the gene sequence
and the larger polypeptide (δ'L) was increased in size by 521 Da. The
nature of the larger polypeptide is presently under investigation.
Possibilities include mRNA splicing, use of an upstream translational
start signal, readthrough of the stop codon, translational frameshifting,
and posttranslational modification. Whatever the mechanism which
produces δ'L, it must be efficient since the highly overproduced δ' still
produces the same level of δ'L relative to δ'S as is found within the
holoenzyme. Irrespective of how δ'L is synthesized, the fact remains
that δ'L and δ'S are different. Presumably they also have functional
differences as in the case of the related γ and τ subunits. Whereas τ and
γ both appear to be within each holoenzyme molecule, it remains to be
shown whether the δ'L and δ'S subunits are on one or on different
holoenzyme molecules.

Sequence analysis of δ'L and δ'S shows they have identical
N-termini, proving δ'L is not derived from an alternate upstream ATG start
site. Translational readthrough of the stop codon was considered as an
explanation, which would produce a protein containing 19 additional
amino acids before the next stop codon in the open reading frame, but
this would increase the size of δ' by 2130 Da, much larger than the
observed mass of δ'L. Treatment of δ' with calf intestinal and bacterial
alkaline phosphatases did not affect the mobility of either δ'S or δ'L,
suggesting that serine and threonine phosphorylation is not involved in
the formation of δ'L; attachment of other groups remains a possibility.
Hence, translational frameshifting (or jumping), covalent modification
(other than phosphate on Ser or Thr) and mRNA splicing remain possible.

It seems most pertinent to consider translational frameshifting
as a source of δ'L since such a mechanism has precedent in holoenzyme
structure. The dnaX gene encoding the τ subunit of the holoenzyme
generates the γ subunit by a translational frameshift into the -1
reading frame. If δ'L is produced by a -1 frameshift, the frameshift
would have to occur upstream of the holB stop codon but not so far
upstream that a -1 frameshift would produce a truncated protein due to
running into an early -1 frame stop codon. Thus the -1 frameshift
would have to occur at or after the last -1 frame stop codon near
Glu320, after which translation would proceed past the normal stop
codon in the open reading frame to produce a protein which is 7 amino
acids larger than that predicted by the open reading frame of holB.

The γ complex expends ATP energy to clamp the β subunit onto a
primer, and it is this β dimer clamp that tethers the αε polymerase to
the template for rapid and highly processive DNA synthesis; the αε
polymerase is only efficient after the β subunit has been clamped
onto the DNA by γ complex action. A mixture of the γ and δ subunits is
sufficient in this assay to clamp β onto DNA; however, much more γ and δ
is needed relative to the amount of γ complex. The δ' subunit stimulates
γ and δ in this assay such that the amounts of γ, δ and δ' are nearly
comparable with the amount of γ complex that is required (the χ and ψ
subunits give another 3-8 fold stimulation of activity at low
concentrations of γδδ', as described in the accompanying report).
Likewise, neither δ nor δ' has a large effect on the ATPase activity of γ,
but addition of both δ and δ' to γ gives a 30-fold stimulation of the
γ ATPase activity. The requirement of both δ and δ' for efficient
replication activity and for maximal ATPase activity of γ correlates
with the physical studies in the accompanying report, which show that δ
and δ' form a complex and the δδ' complex binds tightly to γ, whereas
when δ and δ' are added separately with γ they do not form a strong γδ or
γδ' complex.

The τ subunit contains the sequence of the γ subunit (γ is
produced from τ) plus an extra domain of 212 amino acids which binds
to α and to DNA.

A homology search of the translated GenBank indicated that the
most homologous protein to δ' of the present invention was another E.
coli protein, the γ/τ subunit(s) of DNA polymerase III holoenzyme. There
is 27% identity and 44% similarity including conservative substitutions
over the entire length of δ' and γ/τ. One particular region in δ' of 50
amino acids (amino acids 110-159) is strikingly similar to γ/τ (amino
acids 121-170), having 49% identity. A putative helix-turn-helix motif
in γ/τ (Ala114X3Gly118X5Leu124) is positioned just 19 residues
downstream of the helix-turn-helix motif in δ'.

The extent of sequence homology between δ' and the γ/τ subunits
is above the level required to speculate that they have similar three
dimensional structures; when both δ'S and δ'L are taken into account,
four of the eleven subunits within the holoenzyme, according to the
present invention, may have similar structures.

The interactions between δ and δ' were also studied as part of the
present invention.

Equal amounts of δ and δ' were incubated together for 30 minutes
at 15°C and then analyzed by gel filtration and glycerol gradient
sedimentation. Gel filtration analysis showed that the δ and δ' subunits
comigrate and elute approximately six-to-eight fractions earlier than
either δ or δ' alone, indicating that they form a δδ' complex. Comparison
with protein standards yields a Stokes radius of 31.1 Å. The δ and δ'
also comigrated during glycerol gradient analysis and sedimented
faster than either δ or δ' alone, again consistent with formation of a δδ'
complex with an S value of 3.9S. Combining the S value and Stokes
radius yields a native mass of 53 kDa for the δδ' complex, more
consistent with the mass of a 1:1 complex of δ1δ'1 (75.6 kDa) than of a
higher order aggregate of δδ'. Both δ'L and δ'S are visible in the δδ'
complex, indicating they are present as a mixture of δδ'L and δδ'S.
Formation of a trimeric δδ'Lδ'S complex is unlikely as the combined
mass would be 113 kDa, twice the observed mass. However, if free δ
and δ' were in a rapid equilibrium with the δδ' complex then the
observed mass of the δδ' complex would be a weighted average of the
amount of complex and amount of free subunits, and therefore the
possibility of a higher order aggregate such as a δδ'Lδ'S complex cannot
be rigorously excluded.

Densitometry analysis of the Coomassie Blue stained gel yielded
a molar ratio of δ:δ' of 1.1:1.0, respectively (the two δ' bands were
considered together as one δ'), further supporting the δ1δ'1 composition.
Different proteins may take up different amounts of Coomassie Blue
stain and therefore molar ratios determined by densitometry must be
regarded as tentative. A dynamic light scattering analysis of δ, δ' and
δδ' complex is also presented in the table below.

The Stokes radius and sedimentation coefficient of δ, δ' and δδ'
complex were determined from the gel filtration and glycerol gradient
sedimentation analyses, and the native molecular mass and the
frictional coefficient were calculated from the Stokes radius and S
value. These calculations require the partial specific volume of δ and
δ'; these volumes were calculated by summation of the partial specific
volumes of the individual amino acids for each of δ and δ'. Molecular
weights of δ, δ' and the δδ' complex (assuming a composition of δ1δ'1)
were calculated from the gene sequences of δ and δ'.
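
The calculation of native mass and frictional coefficient from the Stokes radius and S value can be sketched as follows. The text does not name the formula or the solvent parameters; this sketch assumes the standard hydrodynamic relation M = 6πηN·a·s / (1 − v̄ρ) with water-like values (η = 0.01 poise, ρ = 1.0 g/ml), which happen to reproduce the tabulated masses closely. The "frictional coefficient" is interpreted here as the ratio f/f0, that is, the Stokes radius divided by the radius of an anhydrous sphere of the same mass. Function names are illustrative.

```python
import math

AVOGADRO = 6.022e23       # molecules per mole
VISCOSITY = 0.01          # poise (g cm^-1 s^-1); assumed, water near 20 C
SOLVENT_DENSITY = 1.0     # g/ml; assumed


def native_mass(stokes_radius_angstrom: float, s_value: float,
                partial_specific_volume: float = 0.74) -> float:
    """Native mass (g/mol) from the Stokes radius (angstrom) and S value (Svedberg)."""
    radius_cm = stokes_radius_angstrom * 1e-8
    s_seconds = s_value * 1e-13
    buoyancy = 1.0 - partial_specific_volume * SOLVENT_DENSITY
    return 6 * math.pi * VISCOSITY * AVOGADRO * radius_cm * s_seconds / buoyancy


def frictional_ratio(stokes_radius_angstrom: float, mass: float,
                     partial_specific_volume: float = 0.74) -> float:
    """Ratio f/f0: Stokes radius over the radius of an equivalent anhydrous sphere."""
    sphere_volume_cm3 = partial_specific_volume * mass / AVOGADRO
    sphere_radius_cm = (3 * sphere_volume_cm3 / (4 * math.pi)) ** (1 / 3)
    return stokes_radius_angstrom * 1e-8 / sphere_radius_cm


if __name__ == "__main__":
    for name, radius, s in [("delta", 26.5, 3.0),
                            ("delta-prime", 25.8, 3.0),
                            ("delta/delta-prime complex", 31.1, 3.9)]:
        mass = native_mass(radius, s)
        print(f"{name}: {mass:,.0f} Da, f/f0 = {frictional_ratio(radius, mass):.2f}")
```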

                                            δ         δ'        δδ'

Stokes radius (Å)                           26.5      25.8      31.1
Sedimentation coefficient (S)               3.0       3.0       3.9
Partial specific volume                     0.74      0.74      0.74
Native mass (radius and S value)            34,708    33,791    52,952
Native mass (gene sequence)                 38,704    36,934    75,630
Frictional coefficient                      1.22      1.20      1.25
Diffusion coefficient (light scattering)    7.60      8.16      6.61
Radius calculated (D)                       28.2      26.3      32.5

The diffusion coefficient obtained from the light scattering
analysis can be used to calculate the Stokes radius, and these values
were within 6% of the Stokes radius of δ, δ' and δδ' complex determined
in gel filtration.
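
The conversion from diffusion coefficient to Stokes radius is the Stokes-Einstein relation, R = kT / (6πηD). The sketch below assumes that the tabulated diffusion coefficients are in units of 10⁻⁷ cm²/s and that the measurement was made at about 20°C in a solvent of viscosity 0.01 poise; none of these conditions is stated in the text, but under these assumptions the calculated radii agree with the tabulated "Radius calculated (D)" values.

```python
import math

BOLTZMANN = 1.380649e-16  # erg/K (CGS units)


def stokes_radius_angstrom(diffusion_cm2_per_s: float,
                           temperature_k: float = 293.15,
                           viscosity_poise: float = 0.01) -> float:
    """Stokes radius (angstrom) from a diffusion coefficient via Stokes-Einstein.

    Temperature and viscosity defaults are assumptions, not values from the text.
    """
    radius_cm = BOLTZMANN * temperature_k / (
        6 * math.pi * viscosity_poise * diffusion_cm2_per_s
    )
    return radius_cm * 1e8


if __name__ == "__main__":
    # Tabulated diffusion coefficients, assumed to be in 1e-7 cm^2/s.
    for name, d in [("delta", 7.60e-7),
                    ("delta-prime", 8.16e-7),
                    ("delta/delta-prime complex", 6.61e-7)]:
        print(f"{name}: {stokes_radius_angstrom(d):.1f} angstrom")
```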

In the γ complex, the γ, δ and δ' subunits are bound together along
with the χ and ψ subunits. The activity analysis described herein
indicates that γ and δ interact since both are necessary and sufficient
to assemble the β clamp onto DNA. Further, the δ' subunit stimulates
the DNA dependent ATPase activity of γ, indicating that γ and δ' interact.

The physical interactions between δ, δ' and γ were examined using
the gel filtration technique, which detects tightly bound protein-protein
complexes, but since components are not at equilibrium during gel
filtration, weak protein complexes will dissociate. The γ subunit
(47 kDa) is larger than δ and δ' and is at least a dimer in its native state
with a large Stokes radius and quite an asymmetric shape (γ runs as a
trimer or tetramer in gel filtration and as a dimer in a glycerol
gradient). The γ was mixed with a 4-fold molar excess of δ and δ' then
gel filtered. A complex of γδδ' was formed as indicated by the
comigration of both the δ and δ' subunits with γ. The excess δδ' complex
eluted much later (fractions 40-46). Since δ binds δ', it is possible that
only one, for example δ, binds γ and the other (e.g. δ') is part of the
complex by virtue of binding δ instead of directly interacting with γ. To
determine which subunit, δ or δ', binds directly to γ, the γ subunit was
mixed with either δ or δ' then gel filtered. The mixture of γ and δ
showed that γ and δ did not form a gel filterable γδ complex, as indicated
by the absence of δ in fractions 24-32 containing γ. The mixture of γ
and δ' showed that δ' did not form a complex with γ either, as indicated
by the absence of δ' in fractions containing γ. Therefore both δ and δ'
must be present to form a gel filterable complex with γ. Using pure
cloned δ, no γδ complex in gel filtration (or in glycerol gradient analysis)
was seen.

The gel filtration column fractions of the γδδ' complex were
analyzed for their activity in assembly of the β clamp on primed DNA.
Fractions containing the γδδ' complex were quite active. The δδ'
complex, even at high concentration, is not active in assembly of the β
clamp, and therefore the slight amount of activity in following fractions
was probably due to a slight amount of γ which trailed into the peak of
the δδ' complex, thus giving activity in the assay. The column fractions
of the γδ and γδ' mixtures were inactive except for the peak fraction of γ
in the γδ' analysis, which supported weak activity; there was a slight,
barely detectable amount of δ' (but not δ) in the fractions containing γ,
as though a slight amount of γδ' complex was formed and survived the
column.

Following these studies with δ and δ', the present invention has
found that δ behaved as a monomer in gel filtration and glycerol
gradient sedimentation. The δ' subunit also appeared monomeric.
Neither δ nor δ', when separate, formed a gel filterable complex with the
γ subunit. Yet they most likely bind to γ (at least weakly) as indicated
by activity assays in which γδ is active (without δ') in assembly of the
β clamp, and δ' (without δ) stimulates the DNA dependent ATPase
activity of γ. The δ and δ' subunits bound each other to form a gel
filterable 1:1 δ1δ'1 complex and when mixed with γ they efficiently
formed a tight gel filterable γδδ' complex. Hence, the binding of δ and δ'
to γ is cooperative.

The δ' subunit is a mixture of two related proteins, δ'L and δ'S,
which are encoded by the same gene; δ'L is 521 Da larger than the gene
sequence predicts. The functional and structural difference between
them is presently unknown. In these binding studies, both δ'S and δ'L
bound to δ and they both assembled into the γδδ' and τδδ' complexes,
consistent with the fact that both δ'L and δ'S are observed within
pol III* and the γ complex.

No single subunit of the γ complex is active in assembling the β
clamp on DNA. Presumably this reaction is too complicated for just one
protein. A mixture of γ and δ is capable of assembling β onto DNA,
although they are inefficient and require δ' for efficient activity.
Perhaps δ' increases the efficiency of γδ by physically bringing γ and δ
together in the γδδ' complex, although it is also possible that δ'
participates directly in the chemistry of the reaction. The γ subunit has
a low level of DNA dependent ATPase activity and, as described above, δ
binds the β subunit. These two facts allow speculation that γ binds the
primed template and δ brings in the β subunit, then ATP hydrolysis is
coupled to assembly of the ring shaped β dimer around the DNA.

Since γ is known to bind ATP and has a low level of DNA
dependent ATPase activity, it is an obvious candidate as the subunit
which interacts with the ATP in the β clamp assembly reaction. Two
molecules of ATP are hydrolyzed in the initiation reaction in which the
holoenzyme becomes clamped onto a primed template to form the
initiation complex. This initiation reaction has its basis in the
assembly of the β clamp on DNA. The stoichiometry of two ATP
hydrolyzed in formation of one initiation complex suggests two
proteins hydrolyze ATP. These two proteins may be the two halves of a
γ dimer. However, it is also possible that δ interacts with ATP. The
sequence of δ shows a very close match to the consensus for an ATP
binding site and UV induced cross-linking studies suggest that δ binds
ATP. The availability of δ in quantity should now make possible a full
description of the mechanism by which ATP is coupled to assembly of the
ring shaped β dimer around DNA.

The third subunit according to the present invention, that of θ,
was also identified, purified, cloned and sequenced. N-terminal
analysis of the θ peptide yielded the following sequence of 40 amino
acids:

Met Leu Lys Asn Leu Ala Lys Leu Asp Gln Thr Glu Met Asp Lys
                  5                  10                  15
Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Arg
                 20                  25                  30
Tyr Asn Met Pro Val Ile Ala Glu Ala Val
                 35                  40

Based upon this sequence, two DNA probes were fashioned. These
probes had the sequences of:

ATG CTG AAA AAC CTG GCT AAA CTG GAT CAG ACT GAA ATG GAT AAA  45
GTT AAC GTT GAT  57; and

CTG GCT GCT GCT GGT GTT GCT TTT AAG GAA CGT TAT AAC ATG CCG  45
GTT ATT GCT GAA  57.




These two probes were also end-labelled with 32P for use with
Southern blot procedures.

For Southern blot analysis, E. coli DNA was cut with the 8 Kohara
map enzymes [see Cell 50:495 (1987)]. The two probes described above
were used to probe two Southern blots of E. coli DNA. The bands (DNA
fragments) in common with the two blots were noted, as was their size.
At least 3 positions on the Kohara map of the E. coli chromosome were
consistent with the Southern blot fragmentation pattern.

Thus, based upon these findings, E. coli DNA digested with either
EcoRV or PvuII following DNA extraction [see J.M.B. 3:208 (1961)] was
run out in an agarose gel, and all the DNA in the size region of the gel
corresponding to the fragment size containing θ for that enzyme (PvuII
or EcoRV) from the Southern blot analysis was extracted from the gel
and cloned into M13mp18 and M13mp19 using conventional techniques.
The M13 transformant DNAs were analyzed by Southern blot and probed
using the two probes described above. One M13 DNA was obtained with
the θ sequence. When this M13-θ was sequenced, however, not all the
theta gene was present; the gene extended beyond the PvuII restriction
site. The M13-θ was then used as a reagent to obtain the complete θ
gene.

A Kohara λ phage (λ336) was grown and the θ gene in E. coli was
excised using an EcoRV cut 2.7 kb fragment. Next, a filter containing
all the Kohara λ phage was probed using the partial θ gene as the probe.
Thus, it was possible to identify the λ phage containing the full θ gene.
The holE gene was then cloned from the λ phage into pUC18 and
subsequently sequenced. The full genetic sequence for the θ gene was
thus determined to be:
ATG CTG AAG AAT CTG GCN AAA CTG GAT CAA ACA GAA ATG   39
GAT AAA GTG AAT GTC GAT TTG GCG GCG GCC GGG GTG GCA   78
TTT AAA GAA CGT TAC AAT ATG CCG GTG ATC GCT GAA GCG  117
GTN GAA CGT GAA CAG CCT GAA CAT TTG CGC AGC TGG TTT  156
CGC GAG CGG CTT ATT GCC CAC CGT TTG GCT TCG GTC AAT  195
CTG TCA CGT TTA CCT TAC GAG CCC AAA CTT AAA          228




The open reading frame above predicts that θ is a 76 amino acid
protein of 8,629 Da. The underlined nucleotide sequence exactly
matches the corresponding N-terminal sequence of θ. In addition, the
upstream sequence contains two putative RNA polymerase promoter
signals and a Shine-Dalgarno sequence. This upstream sequence is:
""AG GCGTAGCGAA GQGAGCGTGC AGTTGAAGCC ATATTATCTA TTCCTTTTTG 52 
TAATAACTTT TTTACAGAGG ATAACXTTGILC^AICTCTG AGTOGAGGAT 102 
CATCAATTCC GGCTTGCCAT CCTGQCTCAC TCTTAGTAAC TTTTGCCCGC 152 
GAATGATGAG_GAGATEAAGA 172 

The downstream sequence begins with a stop codon:

TAA AACTTATAC AGAGTTACAC TTTCTTACAT AACGCCTGCT AAATTATGAG  52
TATTTTCTAA ACCGCACTCA TAATTTGCAG TCATTTTGAA AAGGAAGTCA 102

The open reading frame translates into the peptide sequence:

Met Leu Lys Asn Leu Ala Lys Leu Asp Gln Thr Glu Met Asp Lys
                  5                  10                  15
Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Arg
                 20                  25                  30
Tyr Asn Met Pro Val Ile Ala Glu Ala Val Glu Arg Glu Gln Pro
                 35                  40                  45
Glu His Leu Arg Ser Trp Phe Arg Glu Arg Leu Ile Ala His Arg
                 50                  55                  60
Leu Ala Ser Val Asn Leu Ser Arg Leu Pro Tyr Glu Pro Lys Leu
                 65                  70                  75
Lys
 76

Using site-directed mutagenesis, the sequence at the initial Met codon
(AGA ATG) was mutated to CAT ATG (NdeI site) using an oligonucleotide
with 15 bases on either side of the mutation. This was then used to
obtain the overproduction of the θ gene, in which the mp19-θ (a 2700 bp
insert) was grown in strain CJ236 cells in the presence of uridine. The
single stranded DNA from these cells was purified, hybridized with
the oligonucleotide carrying the NdeI mutation, and replicated with the
holoenzyme in vitro. XL1-Blue cells were transformed with the double
stranded DNA product and ten plaques were selected for miniprep
sequencing; all 10 plaques contained the mutation. The θ sequence was
excised from the DNA with HindIII and NdeI, and the resulting 1 kbp
fragment was inserted into pET-3c [see Methods in Enzymology 185:60
(1990)]. The resulting pET-3c-θ was used to transform competent cells
[BL21(DE3)]. Single colonies of the transformed cells were grown in
liquid media at 37°C to an OD of about 0.6, induced with IPTG generally
as described previously, and harvested post induction. Successful
overexpression of the θ peptide was obtained using this system.

The N-terminal sequence analysis of θ was carried out as follows:
Pol III* was purified [see J. Biol. Chem. 263:6570 (1988)] except that the
last step using Superose 6 was replaced with an ATP-agarose column
(Pharmacia, Type II) which was eluted with a linear salt gradient. After
the δδ' eluted from the ATP-agarose column, a mixture of pure pol III' and
γ complex eluted together. This mixture was separated by column
chromatography on MonoQ using a linear gradient of 0-0.4 M NaCl in
buffer A. The pol III', which was eluted after the γχψ complex, was used
as the source of θ subunit. The θ subunit was separated from the α, τ
and ε subunits of pol III' by electrophoresis in a 15% SDS polyacrylamide
gel and was electroblotted (110 pmol) onto PVDF membrane. The θ subunit
was visualized by Ponceau S stain, and the N-terminal sequence was
determined to be:

NH2-Met Leu Lys Asn Leu Ala Lys Leu Asp Gln Thr Glu Met Asp Lys
    Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Ala Tyr
                     20                  25
    Asn Met Pro Val Ile Ala Glu (Ala) (Val)
                 35

in which the parentheses indicate uncertain amino acid assignments.

The θ gene was isolated using E. coli genomic DNA isolated from strain
C600 [see J. Mol. Biol. 3:208 (1961)], cut with the Kohara panel of
restriction enzymes (BamHI, HindIII, EcoRI, EcoRV, BglI, KpnI, PstI and
PvuII) and separated in a 0.8% native agarose gel. The gel was
depurinated (0.25 M HCl), denatured (0.5 M NaOH, 1.5 M NaCl) and
neutralized (1 M Tris, 2 M NaCl, pH 5.0) prior to transfer of the DNA to
Genescreen Plus (DuPont New England Nuclear) using a Vacugene
(Pharmacia) apparatus in the presence of 2xSSC (0.3 M NaCl, 0.3 M
sodium citrate, pH 7.0). The membrane was air dried prior to
hybridization. Two synthetic oligonucleotide DNA 57-mer probes were
designed based on the N-terminal sequence of θ, assuming the highest
frequency of codon usage and favoring T over C in the wobble position.
The two probes (5'->3') were:

Theta 1 (codons 1-19):
ATG CTG AAA AAC CTG GCT AAA CTG GAT CAG ACT GAA ATG GAT 42
AAA GTT AAC GTT GAT 57; and

Theta 2 (codons 20-38):
CTG GCT GCT GCT GGT GTT GCT TTT AAA GAA CGT TAT AAC ATG 42
CCG GTT ATT GCT GAA 57.
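
The probe design rule described above (one codon per residue, chosen
by codon usage) can be illustrated with a short back-translation
sketch. The codon choices in the table are simply read off the Theta 1
probe itself and are an assumption for illustration, not a general
E. coli codon-usage table; the function and variable names are
hypothetical.

```python
# Preferred codon per residue, read off the Theta 1 probe (illustrative only;
# this is not a measured codon-usage table).
PREFERRED_CODON = {
    "M": "ATG", "L": "CTG", "K": "AAA", "N": "AAC", "A": "GCT",
    "D": "GAT", "Q": "CAG", "T": "ACT", "E": "GAA", "V": "GTT",
}


def back_translate(peptide):
    """Return one guessed coding sequence for a one-letter peptide string."""
    return "".join(PREFERRED_CODON[residue] for residue in peptide)


# Residues 1-19 of the N-terminal sequence of theta, in one-letter code.
THETA_1_19 = "MLKNLAKLDQTEMDKVNVD"

# Theta 1 probe as listed in the text, with the spacing removed.
THETA_1_PROBE = "".join(
    "ATG CTG AAA AAC CTG GCT AAA CTG GAT CAG ACT GAA ATG GAT "
    "AAA GTT AAC GTT GAT".split()
)

guess = back_translate(THETA_1_19)
print(len(guess), guess == THETA_1_PROBE)
```

Because a guessed sequence of this kind fixes one codon out of several
possible for most residues, mismatches against the true gene are
expected, which is why hybridization stringency matters for such
probes.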

The DNA 57-mers (100 pmol each) were 5' end-labelled using 1 µM
[γ-32P]ATP and T4 polynucleotide kinase, and then used to probe a
Southern blot of the restricted E. coli genomic DNA. Two Southern blots
were hybridized individually using one or the other of the 57-mer
probes overnight in the same buffer as above except with an additional
200 µg/ml of denatured salmon sperm DNA. The Southerns were washed
in 2XSDS at room temperature for 30 minutes, then 3 hours at 42°C
(changing the buffer each hour), then exposed to X-ray film. The Theta 1
probe showed a single band in 7 of the 8 restriction digests; the Theta 2
probe consistently showed many bands in each lane which were
eliminated equally as the hybridization and washing conditions were
gradually increased in stringency, suggesting that Theta 2 did not
match the true sequence of the holE gene. After holE was cloned and
sequenced, it was found that 7 nucleotides of Theta 1 and 12
nucleotides of Theta 2 did not match the holE sequence.

To clone the holE gene, 100 ng of E. coli DNA was digested with
PvuII, and the small population of DNA fragments migrating in the 400
to 600 bp range (the Southern blot using Theta 1 probe indicated holE
was on a 500 bp PvuII fragment) was extracted from the agarose gel,
blunt-end ligated into M13mp18 digested with HincII, and transformed
into competent XL1-Blue cells. Presence of the holE gene was
determined by Southern blot analysis of minilysate DNA prepared from
recombinant colonies using the 5' end-labelled Theta 1 as a probe. One
positive clone was obtained and sequenced; it contained approximately
one-half of the holE gene (a PvuII site lies in the middle of holE). This
fragment of holE was uniformly labelled using the random primer
labelling method, and used to screen the complete Kohara ordered
lambda phage library of E. coli chromosomal DNA transferred onto a
nylon membrane. Prehybridization and hybridization were conducted as
described above except that the temperature was increased to 65°C and
the wash steps were more stringent (2xSSC, 0.2% SDS, next 1xSSC, and
then 0.5xSSC at 65°C). A single phage clone, λ 19H3 (336 of the
miniset) [see Cell 50:495 (1987)], hybridized with both the genomic
fragment and the Theta 1 probe.

From the phage, a 2.7 kb EcoRV fragment containing the θ gene was
excised, purified from a native agarose gel, and blunt-end ligated into
the HincII site of M13mp19 to yield M13mp19-θ. The 2.7 kb EcoRI-
HindIII fragment from M13mp19-θ was excised, gel purified, and
directionally ligated into the corresponding sites of pUC18 to generate
pUC-θ. Both strands of the holE gene in pUC-θ were sequenced using the
Sequenase kit, [α-35S]dATP, and synthetic DNA 20-mers. This time the
entire holE gene was present.

An NdeI site was generated at the start codon of the holE gene by
oligonucleotide site directed mutagenesis using a DNA 33-mer:

ATGATGAGGA GATTACATAT GCTGAAGAAT CTG 33

containing an NdeI site (CATATG) at the start codon of holE to prime
replication of M13mp19-θ viral ssDNA, and using SSB and DNA
polymerase III holoenzyme in place of DNA polymerase I. The NdeI site
in the resultant phage (M13mp19-θ-NdeI) was verified by DNA
sequencing. An approximately 1 kb NdeI-HindIII fragment was excised
from M13mp19-θ-NdeI and directionally ligated into the corresponding
sites of pUC18 to yield pUC-θ-NdeI. A 1 kb NdeI/BamHI fragment from
pUC-θ-NdeI was then subcloned directionally into pET3c digested with
both NdeI and BamHI to generate the overproducing plasmid, pET-θ.

Reconstitution assays contained 72 ng phage φX174 ssDNA (0.04
pmol as circles) uniquely primed with a DNA 30-mer, 0.98 µg SSB (13.6
pmol as tetramer), 10 ng β (0.13 pmol as dimer), and 4 ng γ complex
(0.02 pmol) in 20 mM Tris-HCl (pH 7.5), 8 mM MgCl2, 5 mM DTT, 4%
glycerol, 40 µg/ml BSA, 0.5 mM ATP, 60 µM dGTP, and 0.1 mM EDTA in a
final volume of 25 µl (after addition of αε or αεθ). The αε and αεθ
complexes were each preformed upon mixing 38 pmol each of α and ε,
and when present, 152 pmol of θ in 12.5 µl of 25 mM Tris-HCl (pH 7.5),
2 mM DTT, 1 mM EDTA, 10% glycerol followed by incubation for 1 hour
at 15°C. These protein complexes were diluted 30-fold in the same
buffer just prior to addition to the assay on ice, then the assay tube
was shifted to 37°C for 6 minutes to allow reconstitution of the
processive polymerase on the primed ssDNA. DNA synthesis was
initiated upon rapid addition of 60 µM dATP and 20 µM [α-32P]TTP, then
quenched after 15 seconds and quantitated using DE81 paper. Proteins
used in the reconstitution assays were purified, and their
concentrations determined using BSA as a standard.
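
The molar amount of template quoted for this assay can be checked with
a short calculation. The sketch below assumes an average mass of 330
g/mol per nucleotide of ssDNA and a φX174 genome length of 5386
nucleotides; neither value is stated in the text (which gives only
"5.4 kb"), and the function name is hypothetical.

```python
AVG_NT_MASS = 330.0   # g/mol per nucleotide of ssDNA (assumed average)
PHIX174_LENGTH = 5386  # nucleotides (assumed genome length, about 5.4 kb)


def ssdna_pmol(mass_ng, length_nt, nt_mass=AVG_NT_MASS):
    """Convert a mass of circular ssDNA (ng) into pmol of circles."""
    grams = mass_ng * 1e-9
    moles = grams / (length_nt * nt_mass)
    return moles * 1e12


phix_pmol = ssdna_pmol(72, PHIX174_LENGTH)
print(round(phix_pmol, 2))
```

Under these assumptions 72 ng of template corresponds to roughly 0.04
pmol of circles, in line with the figure given for the assay.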

The following synthetic DNA 56-mer was designed as a hooked
primer template to assay 3'->5' exonuclease activity:

[Structure of the hooked 56-mer primer template; sequence not
recoverable]

This DNA 56-mer (75 pmol as 56-mer) was 3' end-labelled with
75 pmol of [α-32P]dTTP (3000 Ci/mmol) using 200 units of terminal
transferase under conditions specified by the manufacturer (Boehringer)
in a total volume of 100 µl followed by spin dialysis to remove
remaining free nucleotide.

Prior to adding proteins to the assay, θ was titrated into ε upon
incubating 2 µg ε (70 pmol as monomer) with θ (0-10 µg, 0-1.16 nmol
as monomer) in a total volume of 10 µl buffer A containing 50 µg/ml
BSA at 15°C for 1 hour. The εθ mixture was then diluted 100-fold using
buffer A containing 50 µg/ml BSA. A 2.5 µl sample of diluted complex
was added to 200 fmol 3'-32P-end-labelled mispaired hook DNA in 12.5
µl of 25 mM Tris-HCl (pH 7.5), 4% sucrose, 5 mM MgCl2, 8 mM DTT, and
50 µg/ml BSA followed by a 3 minute incubation at 15°C. The reaction
was quenched upon spotting 13 µl of the mixture onto a DE81 filter.
The amount of mispaired nucleotide remaining was quantitated and
subtracted from the total mispaired template added to obtain the
amount of 3' mispaired nucleotide released.

Gel filtration was performed using HR 10/30 fast protein liquid
chromatography columns, Superdex 75 and Superose 12, in buffer C.
Samples containing either θ, ε or α alone, and mixtures of these
subunits, were incubated at 15°C for 1 hour. The entire sample was then
injected onto the column and, after collecting the first 5.6 ml (Superdex
75) or 6.0 ml (Superose 12), fractions of 160 µl were collected and
analyzed in 15% SDS-polyacrylamide gels. Protein standards were a
mixture of proteins of known Stokes radius and were also analyzed.
Densitometry of stained gels was performed using a laser
densitometer, Ultrascan XL (Pharmacia-LKB).

Subunits (α, θ, ε) alone and mixtures of these subunits were
incubated 1 hour at 15°C (with 5% glycerol), then mixed with protein
standards of known S value (50 µg of each protein standard) and
immediately layered onto 12.3 ml linear 10%-30% glycerol gradients in
25 mM Tris-HCl (pH 7.5), 0.1 M NaCl, 1 mM EDTA. The gradients were
centrifuged at 270,000 x g for 44 hours (ε, θ, and εθ complex) or 26
hours (αε and αεθ) at 4°C. Fractions of 150 µl were collected from the
bottom of the tube and analyzed in a 15% SDS-polyacrylamide gel
stained with Coomassie Blue.

In summary, the sequence of the N-terminal 40 amino acids of θ
was obtained from the θ subunit within the polIII' subassembly (αεθτ)
of holoenzyme. This sequence did not match any previously identified in
GenBank, and therefore an attempt was made to identify the holE
gene using the Kohara restriction map of the E. coli chromosome. Two
57-mer DNA probes were made based on the N-terminal amino acid
sequence of θ and were used in a Southern analysis of E. coli genomic
DNA digested with the eight Kohara restriction map enzymes. One of
the 57-mer probes hybridized to a single band in 7 of the 8 digests
analyzed by Southern blot, indicating that these 7 fragments must
overlap in the holE gene. The Kohara restriction map was searched, and
four near matches were located. Since it could not be determined from
the Kohara map which of these positions was the true holE gene, the
small 500 bp PvuII fragment from genomic DNA was directly cloned into
M13mp18. The DNA sequence of this PvuII fragment predicted an amino
acid sequence which matched exactly the 40 residue N-terminal sequence
of θ. However, this was only a partial clone of holE due to an internal
PvuII site. The PvuII fragment and one of the synthetic 57-mers were
subsequently used to probe the entire Kohara library of overlapping λ
phage on one membrane, which identified the location of holE within λ
19H3 (No. 336 of the miniset).

The Kohara restriction map of the chromosome in the vicinity of
λ 19H3 shows a close match to the fragment sizes obtained from the
Southern analysis. The overlapping fragments identify the position of
holE at 40.4 minutes on the E. coli chromosome. DNA analysis showed
two BglI sites separated by 122 bp that span the Theta 1 57-mer probe,
thus explaining the absence of a BglI fragment in the Southern analysis,
in which a small fragment would have run off the end of the gel. This
small fragment would also have been missed in the procedure used by
Kohara, accounting for the single BglI site shown on the map.

A 2.7 kb EcoRV fragment was subcloned from λ 19H3 into
M13mp18 and the holE gene was sequenced. The DNA predicts θ is a 76
amino acid protein of 8,647 Da, slightly smaller than the 10 kDa
estimated from the mobility of θ in an SDS-polyacrylamide gel. The pI of
θ based on the amino acid composition is 9.79, suggesting it is basic,
consistent with its ability to bind to phosphocellulose, but not to Q
Sepharose. The molar extinction coefficient of θ at 280 nm calculated
from its single Trp and the two Tyr residues is 8,250 M-1 cm-1.
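
The extinction coefficient quoted above is consistent with a simple
per-residue sum. The sketch below assumes the commonly used
per-residue values at 280 nm of 5,690 M-1 cm-1 for Trp and 1,280
M-1 cm-1 for Tyr; the text does not say which values were used, so
these constants and the function names are assumptions for
illustration.

```python
EXT_TRP_280 = 5690  # M^-1 cm^-1 per Trp (assumed per-residue value)
EXT_TYR_280 = 1280  # M^-1 cm^-1 per Tyr (assumed per-residue value)


def extinction_280(n_trp, n_tyr):
    """Estimate the molar extinction coefficient at 280 nm from residue counts."""
    return n_trp * EXT_TRP_280 + n_tyr * EXT_TYR_280


def molar_concentration(a280, extinction, path_cm=1.0):
    """Beer-Lambert law: concentration (M) = A / (extinction * path length)."""
    return a280 / (extinction * path_cm)


theta_ext = extinction_280(n_trp=1, n_tyr=2)
print(theta_ext)

# Hypothetical reading: an A280 of 0.825 in a 1 cm cuvette.
example_conc_uM = molar_concentration(0.825, theta_ext) * 1e6
print(round(example_conc_uM, 3))
```

With one Trp and two Tyr the sum is 8,250 M-1 cm-1, matching the value
used for θ; the absorbance reading in the usage lines is made up to
show the conversion.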

Site directed mutagenesis was performed on the holE gene cloned
into M13mp18 to create an NdeI site at the initiator methionine. The
holE gene was excised from the site mutated M13mp18, inserted into
pUC18 (in order to use a convenient BamHI site), then a 1 kb NdeI-BamHI
fragment containing holE was ligated directionally into the NdeI and
BamHI sites of pET3c to yield the pET-θ overproducing plasmid in which
holE expression is driven by T7 RNA polymerase. The pET-θ was
introduced into BL21(DE3) cells and upon induction of T7 RNA
polymerase by IPTG, θ was expressed to 63% of total cell protein. The
induced subunit was freely soluble upon cell lysis and its purification
was relatively straightforward. Four liters of cells were lysed and
300 mg of pure θ was obtained in 78% overall yield after column
chromatography on Q Sepharose, heparin agarose, and phosphocellulose.

Specifically, for the purification of θ, BL21(DE3) cells harboring
the pET-θ expression plasmid were grown in 4 L of LB media containing
50 µg/ml carbenicillin. Upon growth to an OD600 of 0.6, IPTG was added
to 0.4 mM and the cells were incubated at 37°C for 2 hours further
before they were harvested by centrifugation (8.4 g wet weight) at 4°C,
resuspended in 15 ml of cold 50 mM Tris-HCl (pH 7.5) and 10% sucrose,
and stored at -70°C. The cells were thawed and lysed by heat lysis. The
cell lysate (Fraction I, 20 ml, 880 mg) was dialyzed (all procedures
were performed at 4°C) for 2 hours against 2 L of buffer A, and then
diluted 2-fold with buffer A to a conductivity equal to 50 mM NaCl. The
lysate was then applied to a 55 ml Q Sepharose fast flow column
equilibrated in buffer A. The θ flowed through the column as analyzed
by a Coomassie Blue stained 15% SDS-polyacrylamide gel and confirmed by
the stimulation of the ε exonuclease activity assay developed for θ.
The Q Sepharose flow-through fraction (Fraction II, 81 ml, 543 mg) was
then applied to a 50 ml column of heparin agarose (BioRad) which was
equilibrated in buffer A containing 50 mM NaCl. The flow-through
fraction containing θ was approximately 95% pure θ (Fraction III, 110
ml, 464 mg), and was dialyzed overnight against 2 L buffer B, then
applied to a 40 ml phosphocellulose column (P11, Whatman) equilibrated
in buffer B. The column was washed with buffer B and θ was eluted using
a 400 ml linear gradient of 10 mM to 200 mM sodium phosphate (pH 6.5)
in buffer B. Eighty fractions were collected and analyzed for θ.
Fractions 42-56 were pooled (Fraction IV, 68 ml, 300 mg) and dialyzed
against 2 L buffer A prior to aliquoting and storage at -70°C. The
protein concentration was determined using BSA as a standard.
Concentration of pure θ determined by absorbance at 280 nm using an
ε280 of 8,250 M-1 cm-1 was 90% of the protein concentration.

Step                   total     total      specific     fold       %
                       protein   units(1)   activity     purifica-  yield
                       (mg)                 (units/mg)   tion

I   Cell Lysate        880       2.7x10^6   3.1x10^3     1.0        100
II  Q Sepharose        543       2.3x10^6   4.2x10^3     1.4         85
III Heparin Agarose    464       2.6x10^6   5.6x10^3     1.8         96
IV  Phosphocellulose   300       2.1x10^6   7.0x10^3     2.3         78

(1) One unit is defined as the increase in fmol nucleotide released per
minute relative to the same reaction with no θ added (ε alone).
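
The derived columns of the purification table follow from the total
protein and total units at each step. A minimal sketch of that
arithmetic is given below; the data are the table values, while the
function and variable names are hypothetical.

```python
# (step, total protein in mg, total units) taken from the purification table.
STEPS = [
    ("I Cell Lysate", 880, 2.7e6),
    ("II Q Sepharose", 543, 2.3e6),
    ("III Heparin Agarose", 464, 2.6e6),
    ("IV Phosphocellulose", 300, 2.1e6),
]


def summarize(steps):
    """Return (step, specific activity, fold purification, % yield) per step."""
    first_protein, first_units = steps[0][1], steps[0][2]
    first_specific = first_units / first_protein
    rows = []
    for name, protein_mg, units in steps:
        specific = units / protein_mg           # units per mg
        fold = specific / first_specific        # relative to the lysate
        percent_yield = 100.0 * units / first_units
        rows.append((name, specific, fold, percent_yield))
    return rows


summary = summarize(STEPS)
for name, specific, fold, percent_yield in summary:
    print(f"{name:22s} {specific:8.0f} {fold:4.1f} {percent_yield:5.0f}")
```

Rounded to the precision of the table, these values reproduce the
specific activity, fold purification and yield columns.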

Throughout this description of the present invention, buffer A
was 20 mM Tris-HCl (pH 7.5), 10% glycerol, 0.5 mM EDTA, and 2 mM DTT;
buffer B was 10 mM NaPO4 (pH 6.5), 10% glycerol, 0.5 mM EDTA, and 2
mM DTT; and buffer C was 25 mM Tris-HCl (pH 7.5), 10% glycerol, 1 mM
EDTA and 100 mM NaCl.

Studies of the purified cloned θ showed it had the same amino
terminal sequence as predicted by holE (and as θ within the polIII' used
for electroblotting), proving that it was indeed the purified protein
encoded by the cloned gene. The activity of θ (stimulation of ε) co-
purified with θ throughout the preparation.

In searching for activity, the subunit was tested for polymerase
activity and for endonuclease, 3'->5' exonuclease and 5'->3' exonuclease
activities on ssDNA and dsDNA. However, no such activities were
observed.

Since θ is one of the subunits of polIII core, it was examined for
any effect it might exert on the DNA polymerase and 3'->5' exonuclease
activities of α and ε. Previous work compared the ability of αε and
polIII core to form the rapid and processive polymerase with
holoenzyme accessory proteins, but there was no significant difference
between αε and the polIII core (αεθ), suggesting θ had no role in the
speed and processivity of synthesis. With pure θ, assays could be
performed by either adding θ to αε or omitting θ. In a comparison of the
efficiency of αε complex and αεθ complex in their ability to reconstitute
the rapid processive polymerase with accessory proteins, the αε (or
αεθ) was mixed with the γ complex and β subunit in the presence of ATP
and phage φX174 ssDNA primed with a synthetic oligonucleotide and
"coated" with SSB. The mixture was preincubated for 6 minutes at 37°C
to allow the γ complex time to transfer the β ring to DNA, forming the
preinitiation complex clamp, and time for the polymerase to associate
with the preinitiation complex. The rapid processive polymerase can
fully replicate this template (5.4 kb) within 12 seconds. Replication
was then initiated by the addition of dATP and [α-32P]TTP, which were
omitted from the preincubation, and the reaction was terminated after
15 seconds. In this assay, the effect of θ on the amount of DNA
synthesis will be a reflection of either the speed or processivity of the
polymerase or the binding efficiency of the polymerase to the
preinitiation complex. Based on a previous comparison of αε and core, θ
was not expected to influence the speed or processivity of DNA
synthesis. However, in the prior study, the relative affinity of αε and
polIII core for the preinitiation complex was not examined.

The αε and αεθ were titrated into this reconstitution assay and
the results indicate that θ had little influence in the assay. Therefore,
θ does not significantly increase the affinity of αε for the preinitiation
complex. These results are also consistent with prior conclusions. The
accessory protein preinitiation complex greatly stimulates the activity
of the α subunit (without ε) in the reconstitution assay. However, this
"α holoenzyme" was half as fast as the "αε holoenzyme" and is only
processive for 1-3 kb. The ability of θ to stimulate this "α holoenzyme"
was tested in the absence of ε, but the θ subunit had no effect,
indicating that it did not increase the speed or processivity of the "α
holoenzyme" either.

θ was next examined for an effect on the 3'->5' exonuclease
activity of ε using a synthetic "hooked" primer template with a 3'
terminal G-T mispair. A slight (3-fold) but reproducible stimulation by
θ of excision of the 3' mismatched T residue by ε was observed. In the
absence of ε, addition of up to 1.0 ng of θ released no 3' terminal
nucleotide. These results are compatible with an earlier study
comparing 3' excision rates of polIII core and αε complex in which the
polIII core was approximately 3-fold faster than αε. Although a 3-fold
effect is not dramatic and may not be the true intracellular role of θ, it
is large enough to follow θ through the purification procedure. The
stimulation of ε exonuclease activity co-purified with θ throughout the
purification procedure and the overall activity was recovered in high
yield.

The polIII core subassembly of the holoenzyme consists of three
subunits: θ, α (polymerase), and ε (3'->5' exonuclease). Gel filtration
was used to analyze the ability of these individual subunits according
to the present invention to assemble into the polIII core assembly. α
and θ were mixed together and gel filtered; however, θ did not
comigrate with α. Upon mixing ε and θ, a stable εθ complex was formed.
The results of these studies are quite consistent with the activity
analysis presented above in which θ had no effect on the polymerase but
a noticeable effect on the activity of ε.

It has been reported that a concentrated preparation of polIII
core (18 µM) was dimeric, containing two molecules of polIII core which
were presumed to be dimerized through interaction between their θ
subunits, since a concentrated solution of αε complex contained only one
α and one ε. However, in the gel filtration experiments of the present
invention, the reconstituted polIII core migrates only slightly larger
than the α subunit, indicating that θ did not act as an agent of polIII
core dimerization.

Gel filtration experiments performed at a concentration of 73
µM α and 73 µM ε in either the absence of θ (αε only), the presence of a
substoichiometric amount of θ (molar ratio α:ε:θ of 1:1:0.5), or with
excess θ (molar ratio 1:1:3), showed that the presence of θ did not
increase the aggregation state (i.e., monomer to dimer). Thus, it may be
considered that the αε complex by itself is a dimer. However,
comparison of αε and polIII core with size standards in the gel
filtration analysis shows that they elute near the 158 kDa IgG standard,
indicating that they are monomeric, i.e. one of each subunit in the
complex. They have a Stokes radius of 49 Å, which is substantially the
radius determined for the αε complex (50 Å), and similar to the 54 Å
Stokes radius determined in studies of the dilute monomeric polIII core.

To increase confidence in the aggregation state of these
reconstituted complexes, the study of the αε complex and reconstituted
polIII core was extended to an analysis of their sedimentation behavior
in glycerol gradients using the same concentration and ratio of subunits
as in the gel filtration studies. Again the αε and αεθ essentially
co-sedimented regardless of whether θ was present. The αε complex and
polIII core each sedimented with an S value close to that of the 150 kDa
IgG size standard, further indicating they are monomeric subassemblies.

The native molecular weights of θ, ε and of the εθ complex were
also determined using gel filtration and glycerol gradient
sedimentation. The θ and ε subunits were first analyzed separately: θ,
by itself, elutes after myoglobin which is 17.5 kDa, indicating θ is a
monomer (8.6 kDa) rather than a dimer of 17.2 kDa; ε migrated just
after an ovalbumin standard (43.5 kDa), consistent with ε as a 28.5 kDa
monomer rather than a 57 kDa dimer.

To assess the native molecular masses of θ, ε and the εθ complex,
the analysis was extended to sedimentation in glycerol gradients. The
Stokes radius and S values of θ, ε and εθ complex were determined by
comparison to protein standards and their observed mass was
calculated. The observed masses of θ, ε and εθ are 11.6 kDa, 32.7 kDa
and 35.5 kDa, respectively, values most consistent with θ as an 8.6 kDa
monomer, ε as a 28.5 kDa monomer, and the εθ complex having a
composition of ε1θ1 (37.1 kDa); densitometric analysis of the εθ
complex yielded a molar ratio of 1 mol of ε to 0.8 mol θ, consistent
with this composition.
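
The stoichiometry argument above amounts to comparing the observed
mass of the complex with the masses expected for small whole-number
compositions. The sketch below uses the monomer masses given in the
text; restricting the search to one or two copies of each subunit is
an assumption made for illustration, and the names are hypothetical.

```python
EPSILON_KDA = 28.5  # epsilon monomer mass from the text
THETA_KDA = 8.6     # theta monomer mass from the text


def best_composition(observed_kda, max_copies=2):
    """Return the (epsilon copies, theta copies) whose mass is closest to observed."""
    candidates = {}
    for n_eps in range(1, max_copies + 1):
        for n_theta in range(1, max_copies + 1):
            candidates[(n_eps, n_theta)] = n_eps * EPSILON_KDA + n_theta * THETA_KDA
    best = min(candidates, key=lambda key: abs(candidates[key] - observed_kda))
    return best, candidates[best]


composition, expected_kda = best_composition(35.5)
print(composition, round(expected_kda, 1))
```

For the observed 35.5 kDa the closest candidate is one ε with one θ
(37.1 kDa); the next candidate, one ε with two θ, would be 45.7 kDa.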

The fourth subunit according to the present invention, that of ψ,
was also identified, purified, cloned and sequenced. N-terminal
analysis of the ψ peptide yielded a protein sequence which, when
translated to its genetic sequence, was found to be identical to a
portion of a much larger sequence described by Yoshikawa [see Mol. Gen.
Genet. 209:481 (1987)]. However, Yoshikawa's description was for a rimI
sequence from E. coli responsible for encoding an enzyme catalyzing
acetylation of the N-terminal portion of ribosomal protein S-18; his
upstream sequencing from this gene's reading frame was purely
accidental and he does not indicate any appreciation of the gene as a
coding sequence for the ψ peptide.

The amino acid sequence obtained from the ψ peptide is:

Met Thr Ser Arg Arg Asp Trp Gln Leu Gln Gln Leu Gly Ile Thr
                  5                  10                  15
Gln Trp Ser Leu Arg Arg Pro Gly Ala Leu Gln Gly Glu Ile Ala
                 20                  25                  30
Ile Ala Ile Pro Ala His Val Arg Leu Val Met Val Ala Asn Asp
                 35                  40                  45
Leu Pro Ala Leu Thr Asp Pro Leu Val Ser Asp Val Leu Arg Ala
                 50                  55                  60
Leu Thr Val Ser Pro Asp Gln Val Leu Gln Leu Thr Pro Glu Lys
                 65                  70                  75
Ile Ala Met Leu Pro Gln Gly Ser His Cys Asn Ser Trp Arg Leu
                 80                  85                  90
Gly Thr Asp Glu Pro Leu Ser Leu Glu Gly Ala Gln Val Ala Ser
                 95                 100                 105
Pro Ala Leu Thr Asp Leu Arg Ala Asn Pro Thr Ala Arg Ala Ala
                110                 115                 120
Leu Trp Gln Gln Ile Cys Thr Tyr Glu His Asp Phe Phe Pro Gly
                125                 130                 135
Asn Asp
    137

Using the information above, the sequence was translated into
the genomic structure which is:

ATG ACA TCC CGA CGA GAC TGG CAG TTA CAG CAA CTG GGC  39
ATT ACC CAG TGG TCG CTG CGT CGC CCT GGC GCG TTG CAG  78
GGC GAG ATT GCC ATT GCG ATC CCG GCA CAC GTC CGT CTG 117
GTG ATG GTG GCA AAC GAT CTT CCC GCC CTG ACT GAT CCT 156
TTA GTG AGC GAT GTT CTG CGC GCA TTA ACC GTC AGC CGC 195
GAC CAG GTG CTG CAA CTG ACG CCA GAA AAA ATC GCG ATG 234
CTG CCG CAA GGC AGT CAC TGC AAC AGT TGG CGG TTG GGT 273
ACT GAC GAA CCG CTA TCA CTG GAA GGC GCT CAG GTG GCA 312
TCA CCG GCG CTC ACC GAT TTA CGG GCA AAC CCA ACG GCA 351
CGC GCC GCG TTA TGG CAA CAA ATT TGC ACA TAT GAA CAC 390
GAT TTC TTC CCT GGA AAC GAC 411




In addition to the normal sequence for the genomic material, the
gene also contains an internal NdeI site.
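
The internal NdeI site can be located by a plain string search. The
sketch below uses only the last two rows of the coding sequence as
listed (nucleotides 352 to 411) and assumes the standard NdeI
recognition sequence CATATG; the variable names are hypothetical.

```python
NDEI_SITE = "CATATG"  # NdeI recognition sequence (standard; assumed here)

# Last two rows of the listed coding sequence, nucleotides 352-411.
TAIL_START = 352
ORF_TAIL = "".join(
    "CGC GCC GCG TTA TGG CAA CAA ATT TGC ACA TAT GAA CAC "
    "GAT TTC TTC CCT GGA AAC GAC".split()
)

offset = ORF_TAIL.find(NDEI_SITE)   # 0-based offset within the tail, -1 if absent
site_start = TAIL_START + offset    # 1-based position in the coding sequence
site_end = site_start + len(NDEI_SITE) - 1
print(len(ORF_TAIL), site_start, site_end)
```

On this reading the site spans nucleotides 380 to 385, overlapping the
ACA TAT GAA codons near the 3' end of the gene, which is why a
one-step NdeI/BamHI cloning would cut the coding sequence.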

The sequence above is preceded by an upstream sequence
containing two underlined RNA polymerase promoter signals (TTGGCG
and TATATT), and a Shine-Dalgarno (AGGAG) sequence. The complete
upstream sequence is:

GQCGATTATA QCCATATGJT GGCGCGGTA CGACGAATTT GCTATATTTG 50 
COTXCTGAC AACAGGAGCG ATTCGCT 77. 

In addition, the open reading frame is followed by a downstream
sequence beginning with a stop codon:

TGA TTTACCGGCA GCTTACCACA TTGAACAACG CGCCCACQCC TTTCCGTGGA 53 
GTGAAAAAAC GTTTGCCAGC AACCAGGGCG AGCGTTATCT CAACTTTCAG 103 . 

The ψ gene was then produced by PCR using E. coli genomic DNA
and the following (5'->3') primers:

primer 1 (Psi-N):
GATTCCATAT GACATCCCGA CGAGACT 27; and

primer 2 (Psi-C):
GACTGGATCC CTGCAGGCCG GTGAATGAGT 30

As can be seen, primer 1 contains an NdeI site (CATATG), and primer 2
contains a BamHI site (GGATCC).

The PCR-produced DNA was used to clone the ψ gene into the pET-3c
expression plasmid using a two-step cloning procedure necessitated by
the internal NdeI site in the nucleic acid sequence. Briefly, this
procedure involved cutting the PCR product with NdeI restriction
enzyme into two portions of 379 bp (NdeI to NdeI) and 543 bp (NdeI to
BamHI). The 543 bp portion was ligated into plasmid pET-3c (4638
bp) to form an intermediate pET-3ca (5217 bp). The pET-3ca was then
linearized, and the 379 bp portion inserted to form the desired pET-3c
plasmid containing the complete PCR product insert.

The overexpression vector containing the complete insert was
then inserted into E. coli and induced with IPTG as described herein,
and overexpression (an increase to over 20% of total bacterial protein)
of the ψ protein was seen.

The ψ protein was purified by first dissolving the cell membrane
debris in 6 M urea followed by passing the resulting solution through a
hydroxylapatite column, which had been equilibrated previously with a 6
M urea buffer (180 g urea, 12.5 ml 1 M Tris at pH 7.5, 0.5 ml of 0.5 M
EDTA, and 1 ml of 1 M DTT), wherein the ψ peptide will flow through
while almost everything else in solution will be held within the column.
The ψ peptide outflow of the hydroxylapatite column was then
bound to a DEAE column, rinsed with buffer, and eluted with a gradient
of NaCl. Fractions containing the ψ peptide were pooled, dialyzed twice
against 1 liter of buffer, and loaded onto a hexylamine column for final
purification. Fractions from the hexylamine column containing the ψ
peptide were eluted with a NaCl gradient (0.0 to 0.5 M), pooled and
saved as pure ψ subunit peptide.

Studies were also conducted to determine that the ψ gene
according to the present invention encodes the ψ subunit peptide. These
studies determined that the N-terminal sequence of native ψ peptide is
predicted by the ψ gene sequence according to the present invention;
native ψ peptide was obtained and digested with trypsin and a few of
the resulting peptides sequenced, and the sequenced peptides were
encoded by the gene sequence according to the present invention; the
cloned/overproduced/pure ψ peptide made in accordance with the
present invention comigrated with the ψ subunit peptide within the
naturally occurring holoenzyme; and the ψ peptide produced from the
sequence according to the present invention formed a γχψ complex when
mixed with γ and χ, as would occur with natural components.

The γχψ complex was purified [see J. Biol. Chem. 265:1179 (1990)]
from 1.3 kg of the γ/τ overproducing strain (HB101 (pNT203, pSK100)).
The ψ subunit was separated from γ and χ by electrophoresis in a 15%
SDS-polyacrylamide gel, then ψ was electroblotted onto PVDF
membrane for N-terminal sequencing (220 pmol), and onto
nitrocellulose membrane for tryptic digestion (300 pmol) followed by
sequence analysis of tryptic peptides. Proteins were visualized by
Ponceau S stain. The N-terminal sequence was determined to be:
NH2-TSRRDDQLQQLGIT. Two internal tryptic peptides were determined
to be:

NH2-Leu Gly Thr Asp Glu Pro Leu Ser Leu Glu Glu Ala Gln Val Ala
                      5                  10                  15
    Ser Pro; and

NH2-Ala Ala Leu Trp Gln Gln Ile Cys Thr Tyr Glu His Asp Phe Phe
                      5                  10                  15
    Pro Ala

A 3.2 kb PstI/BamHI (DNA modification enzymes, New England
Biolabs) fragment containing holD was excised from XSC\ and ligated
directionally into the polylinker of Bluescript (Stratagene). The 1.5 kb
AccI fragment (one site is in the vector and one is in the insert)
containing holD was excised, the ends filled in using Klenow
polymerase, then ligated into pUC18 (cut with AccI and the end filled
with Klenow) to yield pUC-ψ. Both strands of DNA were sequenced by
the chain termination method of Sanger using the Sequenase kit,
[α-32P]dATP (radiochemicals, New England Nuclear), and synthetic DNA
17-mers (Oligos etc. Inc.).

A 922 bp fragment was amplified from genomic DNA (strain
C600) using Vent DNA polymerase and two synthetic primers, an
upstream 32-mer (CAACAGGAGCGATTCCATATGACATCCCGACG), and a
downstream 30-mer (GACTGGATCCCTGCAGGCCGGTGAATGAGT). The
first two nucleotides in the NdeI site (CATATG) of the upstream 32-
mer and the first 11 nucleotides of the downstream 30-mer (including
the BamHI sequence, GGATCC) are not complementary to the genomic
DNA. Amplification was performed using a thermocycler in a volume of
100 µl containing 1 ng genomic DNA, 1 µM each primer, and 2.5 units of
Vent polymerase in 10 mM Tris-HCl (pH 8.3), 2 mM MgSO4, 200 µM each
dATP, dCTP, dGTP and TTP. Twenty-five cycles were performed: 1
minute at 94°C, 1 minute at 42°C, 2 minutes at 72°C. Amplified DNA
was phenol extracted, ethanol precipitated, then digested with 50 units
NdeI in 100 µl 20 mM Tris-acetate, 10 mM magnesium acetate, 1 mM
DTT and 50 mM potassium acetate (final pH 7.9), overnight at 37°C.
After confirming the NdeI digestion by agarose gel, 50 units of BamHI
was added and digestion was continued for 2 hours. The NdeI/NdeI
fragment, which contained most of the holD gene, and the NdeI/BamHI
fragment were separated in an agarose gel, electroeluted,
phenol/chloroform extracted, ethanol precipitated and dissolved in 10
mM Tris-HCl (pH 7.5), 1 mM EDTA. The holD gene was cloned into pET3c
in two steps. First, the NdeI/BamHI fragment encoding the C-terminus of
ψ was ligated into pET3c digested with NdeI and BamHI to generate
pETψC-ter. Second, the NdeI/NdeI fragment was ligated into pETψC-ter
(linearized with NdeI and dephosphorylated) to yield the
pET-ψ overproducer. DNA sequencing of the pET-ψ confirmed the
correct orientation of the NdeI/NdeI fragment.
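
The statement that only the first two nucleotides of the NdeI site in
the upstream primer differ from the genome can be checked by aligning
the primer against the genomic sequence around the start codon. The
sketch below is an illustration under stated assumptions: the primer
is read with CATATG at the position of the NdeI site, and the genomic
string joins the legible end of the listed upstream sequence to the
first codons of the reading frame. All names are hypothetical.

```python
# Upstream 32-mer, read with the NdeI site as CATATG (assumed reconstruction).
UPSTREAM_PRIMER = "CAACAGGAGCGATTCCATATGACATCCCGACG"

# Genomic sequence over the same window: end of the listed upstream
# sequence (...C AACAGGAGCG ATTCGCT) followed by ATG ACA TCC CGA CG.
GENOMIC = "CAACAGGAGCGATTCGCT" + "ATGACATCCCGACG"


def mismatch_positions(primer, template):
    """Return 1-based positions where two equal-length sequences differ."""
    if len(primer) != len(template):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (p, t) in enumerate(zip(primer, template)) if p != t]


mismatches = mismatch_positions(UPSTREAM_PRIMER, GENOMIC)
ndei_start = UPSTREAM_PRIMER.find("CATATG") + 1  # 1-based start of the site
print(mismatches, ndei_start)
```

Under these assumptions the two mismatches fall on the first two
positions of the NdeI site, so the site is created while the start
codon and the downstream coding sequence are left unchanged.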

The 25 µl assay contained 72 ng M13mp18 ssDNA (0.03 pmol as
circles) primed with a synthetic DNA 30-mer, 0.98 µg SSB (13.6 pmol
as tetramer), 82 ng αε (0.52 pmol), and 33 ng β (0.29 pmol as dimer) in
20 mM Tris-HCl (pH 7.5), 8 mM MgCl2, 40 mM NaCl, 5 mM DTT, 0.1 mM
EDTA, 40 µg/ml BSA, 0.5 mM ATP, and 60 µM each dCTP and dGTP.
Addition of χ, ψ and γδδ' complex to the assay was as follows. The γδδ'
complex, χ and ψ (ψ was initially in 4 M urea) subunits were
preincubated before addition to the assay for 30 minutes at 4°C at
concentrations of 2.4 µg/ml γδδ' complex (14.2 nM), 0-0.75 µg/ml χ (0-45
nM), and 0-0.75 µg/ml ψ (0-48 nM) in 25 mM Tris-HCl (pH 7.5), 2 mM
DTT, 0.5 mM EDTA, 50 µg/ml BSA, 20% glycerol (buffer B) (the
concentration of urea in the preincubation was 8.5 mM or less). One-
half µl of this protein mixture was added to the assay (urea was 0.17
mM or less in the assay after addition of ψ), then the assay was shifted
to 37°C for 5 minutes to allow polymerase assembly before initiating
DNA synthesis upon addition of dATP and [α-32P]dTTP to 60 µM and 20
µM, respectively. After 20 seconds, DNA synthesis was quenched and
quantitated as described in the accompanying report. Assays to
quantitate ψ in purification were performed likewise except the protein
preincubation contained 2.4 µg/ml γδδ' (14.2 nM), 0.75 µg/ml χ (45 nM)
and up to 0.25 µg/ml of protein fraction containing ψ. After the 30
minute preincubation, 0.5 µl was added to the assay reaction. The SSB,
p, y, and x subunits used in these studies were purified, and the x. 8 



a, e, 




and 5' subunits were -prepared from their respective overproducing 
strains. Concentrations of p, 8, 8', % and y were determined from their 
absorbance at 280nm using their molar extinction coefficients: p, 
17,900 lvHcm- 1 ; 8, 46*137 M-1cm- 1 ;8\ 60,136 M- 1 crTr 1 ;x, 29,160 M" 
5 1 cm -1 ; and y, 24,040 ivHcnr 1 . 
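
The absorbance-based concentration measurement described above follows the Beer-Lambert law. A minimal sketch in Python is given here for illustration only; the dictionary keys, the function name, the 1 cm path length and the example reading of A280 = 0.50 are assumptions and do not come from the specification.

```python
# Molar extinction coefficients at 280 nm (M^-1 cm^-1) as listed in the text.
# The dictionary keys are illustrative names, not identifiers from the source.
EXTINCTION_280 = {
    "beta": 17_900,
    "delta": 46_137,
    "delta_prime": 60_136,
    "chi": 29_160,
    "psi": 24_040,
}


def molar_concentration(a280, subunit, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/L."""
    return a280 / (EXTINCTION_280[subunit] * path_cm)


# Hypothetical reading: A280 = 0.50 for a psi sample in a 1 cm cuvette.
conc = molar_concentration(0.50, "psi")
print(f"psi concentration: {conc * 1e6:.1f} uM")
```

The same function applies to any of the listed subunits by changing the key.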

The ATPase assay contained 140 ng M13mp18 ssDNA in 25 mM Tris-
HCl (pH 7.5), 8 mM MgCl2, 50 µM [γ-³²P]ATP, 5.45 pmol γ or τ (as
dimers), 10.9 pmol χ and/or ψ (as monomers) (unless indicated
otherwise) and 1.4 µg SSB (19.4 pmol as tetramer) (when present).
Mixtures of proteins (ψ was initially 2 mg/ml (0.13 mM) in 4 M urea)
were preincubated 30 minutes on ice at 3.8 µM of each subunit (as
monomer) in 30 µl of 25 mM Tris-HCl (pH 7.5), 0.5 mM EDTA, 20%
glycerol (0.1 M urea final concentration) before addition to the assay
(15 mM urea final concentration). Assays were incubated at 37°C for
60 minutes (5 minutes for assays containing τ), then quenched upon
spotting 0.5 µl on polyethyleneimine thin layer plates (Brinkman
Instruments Co.). After chromatography in 0.5 M LiCl, 1 M formic acid,
the free phosphate at the solvent front and ATP remaining near the
origin were quantitated by liquid scintillation.

Samples of ψ (45 µg, 3 nmol as monomer (initially in 4 M urea)),
or a mixture of ψ (45 µg, 3 nmol as monomer) with either γ (65 µg, 0.7
nmol as dimer) or τ (98 µg, 0.7 nmol as dimer) were incubated in 200 µl
25 mM Tris-HCl (pH 7.5), 0.1 M NaCl, 1 mM EDTA, 10% glycerol (0.5 M
urea was present after addition of ψ) for 30 minutes at 15°C. The
200 µl sample was injected onto a Pharmacia HR 10/30 gel filtration
column of either Superdex 75 or Superose 12 at a flow rate of 0.35
ml/min in 25 mM Tris-HCl (pH 7.5), 0.1 M NaCl, 1 mM EDTA, 10%
glycerol. After the first 5.6 ml, fractions of 170 µl were collected and
analyzed in a 15% SDS polyacrylamide gel and the value of Kav was
calculated.
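
The Kav calculation referred to above is conventionally Kav = (Ve - V0)/(Vt - V0), where Ve is the elution volume, V0 the void volume and Vt the total bed volume. The sketch below, in Python, is illustrative only: the 5.6 ml offset and 170 µl fraction size are taken from the text, while the void volume (8 ml), total volume (24 ml) and the use of the fraction midpoint as Ve are assumptions chosen for the example.

```python
def fraction_midpoint_ml(fraction_number, start_ml=5.6, fraction_ml=0.170):
    """Elution volume at the middle of a 1-based fraction number.

    start_ml and fraction_ml follow the text (5.6 ml passed before
    collection began, 170 ul fractions).
    """
    return start_ml + fraction_ml * (fraction_number - 0.5)


def k_av(elution_ml, void_ml, total_ml):
    """Kav = (Ve - V0) / (Vt - V0)."""
    return (elution_ml - void_ml) / (total_ml - void_ml)


# Hypothetical column parameters (not given in the text).
V0, VT = 8.0, 24.0
ve = fraction_midpoint_ml(50)
print(f"Ve = {ve:.2f} ml, Kav = {k_av(ve, V0, VT):.3f}")
```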

A sample of ψ (45 µg, 3 nmol as monomer, initially in 4 M urea) in
200 µl 25 mM Tris-HCl (pH 7.5), 0.1 M NaCl, 1 mM EDTA, 5% glycerol (0.5
M urea final concentration after addition of ψ) was layered onto a 12.3
ml gradient of 10%-30% glycerol in 25 mM Tris-HCl (pH 7.5), 0.1 M NaCl,
1 mM EDTA. Protein standards in 200 µl of the same buffer were loaded
in another tube and the gradients were centrifuged at 270,000 x g for
44 hours at 4°C. Fractions of 150 µl were collected and analyzed in a
15% SDS polyacrylamide gel stained with Coomassie Blue.
The γδδ' complex was formed upon incubation of 60 µg δ (1.55
nmol as monomer) and 60 µg δ' (1.62 nmol as monomer) with an excess
of γ (600 µg, 6.4 nmol as dimer) for 30 minutes at 15°C in 1 ml of 25
mM Tris-HCl (pH 7.5), 2 mM DTT, 0.5 mM EDTA, 20% glycerol (buffer A).
The mixture was chromatographed on a 1 ml HR 5/5 MonoQ column, and
eluted with a 30 ml linear gradient of 0 M-0.4 M NaCl in buffer A. The
γδδ' complex eluted at a unique position, after the elution of free δ', δ
and γ (in that order) and was well resolved from the excess γ. The pure
γδδ' complex was dialyzed against buffer A to remove salt. Protein
concentration was determined using BSA as a standard. Molarity of γδδ'
was calculated from protein concentration assuming the 170 kDa mass
of a complex with subunit composition γ2δ1δ'1, the composition
expected from stoichiometry of subunits in the γ complex.
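
The mass-to-mole conversions used above can be checked with a few lines of arithmetic. The following Python sketch is illustrative; the function names are assumptions, and the subunit masses are back-calculated from the quantities quoted in the text rather than taken from an independent source.

```python
def implied_kda(mass_ug, amount_nmol):
    """Molecular mass implied by a quoted mass and molar amount.

    ug / nmol = (1e-6 g) / (1e-9 mol) = 1e3 g/mol = 1 kDa.
    """
    return mass_ug / amount_nmol


def nanomolar(conc_ug_per_ml, mass_kda):
    """Convert a protein concentration in ug/ml to nM for a given mass."""
    return conc_ug_per_ml / mass_kda * 1000.0


delta = implied_kda(60, 1.55)          # delta monomer
delta_prime = implied_kda(60, 1.62)    # delta' monomer
gamma_dimer = implied_kda(600, 6.4)    # gamma as dimer

complex_kda = gamma_dimer + delta + delta_prime
print(f"implied complex mass: {complex_kda:.1f} kDa")

# 2.4 ug/ml of a 170 kDa complex, as used in the replication assay.
print(f"2.4 ug/ml = {nanomolar(2.4, 170):.1f} nM")
```

The implied masses sum to roughly 170 kDa, in line with the composition assumed in the text, and 2.4 µg/ml comes out near the 14.2 nM quoted for the replication assay.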

The γχψ complex was prepared from 1.3 kg of E. coli and the ψ
subunit was resolved from the γ and χ subunits in a SDS-polyacrylamide
gel, then electroblotted onto PVDF membrane for analysis of the amino
acid sequence of the amino terminus of ψ. The ψ was also
electroblotted onto nitrocellulose followed by tryptic digestion, HPLC
purification of peptides and sequence analysis of two tryptic peptides.
Search of GenBank for DNA sequences encoding these peptides
identified a sequence which was published in a study of the rimI gene
[see Mol Gen Genet 209:481 (1987)]. In order to define the operon
structure of this DNA, the DNA upstream of rimI was sequenced. All
three peptide sequences of ψ were in one reading frame located
immediately upstream of rimI at 99.3 minutes on the E. coli
chromosome, which putatively encodes ψ and is referred to as holD.

The promoter for holD underlined in the sequence has been
identified previously as the promoter for the rimI gene, encoding the
acetylase of ribosomal protein S18, which initiates 29 nucleotides
inside of holD. Hence, holD is in an operon with rimI. Production of ψ was
inefficient relative to rimI protein as judged by the maxicell technique
which detected rimI protein but not ψ. The promoter measured by
Northern analysis was strong [see Mol Gen Genet 209:481 (1987)] and
the Shine-Dalgarno sequence is a good match to the consensus sequence,
as is the spacing from the ATG needed for sufficient translation.
Although the cellular abundance of ψ is not known, if one assumes all
the ψ is sequestered within the holoenzyme, then it is present in very
small amounts, there being only 10-20 copies of the holoenzyme in a
cell. Perhaps the 3-11 fold more frequent use of some rare codons may
contribute to inefficient translation (Leu (UUA), Ser (UCA and AGU), Pro
(CCU and CCC), Thr (ACA), Arg (CGA and CGG)).

The open reading frame of holD encodes a 137 amino acid protein
of 15,174 Da. Amino terminal analysis of the ψ protein within the γχψ
complex showed it was missing the initiating methionine. The molar
extinction coefficient of ψ calculated from its 4 Trp and 1 Tyr is
24,040 M⁻¹cm⁻¹. There is a potential for a leucine zipper at amino
acid residues 25-53, although three prolines fall within the possible
leucine zipper. There is also a helix-turn-helix motif (A/GX3GX5I/V) at
G22G26I33, but again prolines may preclude helix formation. There is
no apparent nucleotide binding site or zinc finger motif.

The polymerase chain reaction was used to amplify holD from
genomic DNA. The synthetic DNA oligonucleotides used as primers were
designed such that an NdeI site was formed at the initiating ATG of
holD and a BamHI site was formed downstream of holD. The amplified
holD gene was inserted into the NdeI/BamHI sites of pET3c in two steps
to yield pET-ψ in which holD is under control of a strong T7 promoter
and is in a favorable context for translation. The sequence of holD in
pET-ψ was found to be identical to that depicted in the sequence, and
upon transformation into BL21(DE3)pLysS cells and subsequent induction
of T7 RNA polymerase with IPTG, the ψ protein was expressed to
approximately 20% of total cell protein.

The ψ protein was completely insoluble and resisted attempts to
obtain even detectable amounts of soluble ψ (lower temperature during
induction, shorter induction time, and extraction of the cell lysate with
1 M NaCl were tested); it was necessary to resort to solubilization of
the induced cell debris with 6 M urea followed by column
chromatography in urea. The ψ was approximately 40% of total protein
in the solubilized cell debris and was purified to apparent homogeneity
upon flowing it through hydroxyapatite, followed by column
chromatography on DEAE sepharose and heparin agarose. By this
procedure, 22 mg of pure ψ was obtained from 1 liter of cell culture in
61% yield. The pure ψ remained in solution upon complete removal of
the urea by dialysis as described in greater detail below.
Four liters of E. coli cells (BL21(DE3)pLysS) harboring the pET-ψ
plasmid were grown at 37°C in LB media supplemented with 1% glucose,
10 µg/ml thiamin, 50 µg/ml thymine, 100 µg/ml ampicillin and 30
µg/ml chloramphenicol. Upon reaching an OD of 1.0, IPTG was added to
0.4 mM and after an additional 2 hours of growth at 37°C, the cells
were harvested by centrifugation (20 g wet weight), resuspended in 20
ml of 50 mM Tris-HCl (pH 7.5) and 10% sucrose (Tris-sucrose) and
frozen at -70°C. The cells lysed upon thawing (due to lysozyme formed
by the pLysS plasmid), and the following components were added on ice
to pack the DNA and precipitate the cell debris: 69 ml Tris-sucrose, 1.2
ml unneutralized 2 M Tris base, 0.2 ml 1 M DTT, and 9 ml of heat lysis
buffer (0.3 M spermidine, 1 M NaCl, 50 mM Tris-HCl (pH 7.5), 10%
sucrose). After 30 minutes incubation on ice, the suspension was
centrifuged in a GSA rotor at 10,000 rpm for 1 hour at 4°C. The cell
debris pellet was resuspended in 50 ml buffer B using a dounce
homogenizer (B pestle), then sonicated until the viscosity was greatly
reduced (approximately 2 minutes total in 15 second intervals) and
centrifuged in two tubes at 18,000 rpm in a SS34 rotor for 30 minutes
at 4°C. The pellet was resuspended in 50 ml buffer B containing 1 M
NaCl using the dounce homogenizer, then pelleted again. This was
repeated, and followed by homogenizing the pellet once again in
50 ml buffer B and pelleting as was done initially. The following
procedures were at 4°C using only one-fourth of the pellet (equivalent
to 1 liter of the cell culture). The assay for ψ is described above, and
column fractions were analyzed in 15% SDS polyacrylamide gels to
directly visualize the ψ protein and aid the exclusion of contaminants
during the pooling of column fractions. The pellet was solubilized in 25
ml buffer A containing freshly deionized 6 M urea. The solubilized
pellet fraction (fraction I, 85 mg, 22 ml) was passed over a 10 ml
column of hydroxyapatite equilibrated in buffer A plus 6 M urea.
The ψ quantitatively flowed through the hydroxyapatite column giving
substantial purification. The protein which flowed through the
hydroxyapatite column was immediately loaded onto a 10 ml column of
DEAE sephacel, equilibrated in buffer A containing 6 M freshly deionized
urea, and eluted with a 100 ml gradient of 0-0.5 M NaCl in buffer A
containing 6 M freshly deionized urea over a period of 4 hours.
Fractions of 1.25 ml were collected and analyzed for ψ as described.
Fractions were pooled and dialyzed overnight against 2 liters of buffer
A containing 3 M freshly deionized urea and then loaded onto a 10 ml
column of hexylamine sepharose. The hexylamine column was eluted
with a 200 ml gradient of 0 M-0.5 M NaCl in buffer B containing 3 M
freshly deionized urea over a period of 4 hours. Eighty fractions were
collected (2.5 ml each), and were analyzed for ψ, then fractions
containing ψ were pooled (fraction IV, 21.6 mg) and urea was removed
by extensive dialysis against 25 mM Tris-HCl (pH 7.5), 0.1 M NaCl, 0.5
mM EDTA (3 changes of 2 liters each). Protein concentration was
determined using BSA as a standard, except at the last step in which a
more accurate assessment of concentration was performed by
absorbance using the value ε280 equal to 24,040 M⁻¹cm⁻¹ calculated from
the sequence of holD. After the absorbance measurement, DTT was
added back to 5 mM and the ψ was aliquoted and stored at -70°C.
Fraction                    total      total         specific      fold           yield
                            protein    units¹        activity      purification   (%)
                            (mg)                     (units/mg)

I   Solubilized pellet      85.0       104.7x10^7    12.0x10^6     1.0            100
II  Hydroxylapatite         42.5        95.9x10^7    22.6x10^6     1.8             92
III DEAE Sepharose          30.6        89.7x10^7    29.3x10^6     2.4             86
IV  Hexylamine Sepharose    21.6        63.9x10^7    29.6x10^6     2.4             61

¹One unit is defined as pmol of nucleotide incorporated in one minute
over and above the pmol incorporated in the assay in the absence of
added ψ.
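
The derived columns of the purification table (specific activity, fold purification and yield) follow directly from total protein and total units. The Python sketch below recomputes them from the tabulated totals; the function and variable names are illustrative assumptions. Recomputed values may differ slightly from the printed ones because of rounding in the original.

```python
# (fraction, total protein in mg, total units) from the purification table.
STEPS = [
    ("I Solubilized pellet", 85.0, 104.7e7),
    ("II Hydroxylapatite", 42.5, 95.9e7),
    ("III DEAE Sepharose", 30.6, 89.7e7),
    ("IV Hexylamine Sepharose", 21.6, 63.9e7),
]


def summarize(steps):
    """Return (name, specific activity, fold purification, % yield) rows."""
    first_units = steps[0][2]
    first_specific = first_units / steps[0][1]
    rows = []
    for name, protein_mg, units in steps:
        specific = units / protein_mg            # units per mg
        fold = specific / first_specific         # relative to fraction I
        yield_pct = 100.0 * units / first_units  # recovery of activity
        rows.append((name, specific, fold, yield_pct))
    return rows


for name, specific, fold, yield_pct in summarize(STEPS):
    print(f"{name:26s} {specific:10.3e} {fold:5.1f} {yield_pct:5.0f}")
```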

The pure ψ protein comigrated with the ψ subunit of polIII*
(holoenzyme lacking only β) in a 15% SDS-polyacrylamide gel. The
N-terminal sequence of the pure cloned ψ matched that of the
holD sequence and the sequence of the natural ψ within the γχψ
complex, indicating that the purified protein is the product of the
cloned gene.

The pure ψ appeared fully soluble in the absence of urea.
However, a 2 mg/ml solution of ψ which appeared clear, and could not
be sedimented in a table top centrifuge, still behaved as an aggregate in
a gel filtration column. Therefore, even though ψ appeared soluble it
was still an aggregate. The aggregated ψ had only weak activity in the
replication assay and was inefficient in binding to other proteins in
physical studies. Therefore before using ψ in assays or in physical
binding experiments, urea was added to a concentration of 4 M to
disaggregate ψ. Once disaggregated, the urea could be quickly removed
by gel filtration and ψ behaved well during filtration in the absence of
urea in the column buffer. However, upon standing a full day at high
concentration (>1 mg/ml) in the absence of urea, it would aggregate
again. The ψ kept in urea would work provided the preparation was
sufficiently concentrated (2 mg/ml) such that it could be diluted at
least 8-fold (to 0.5 M urea) for protein-protein interaction studies,
300-fold for ATPase assays, and 30,000-fold for replication assays. In
0.5 M urea, the ψ bound to γ and τ, and also to the χ subunit. ψ treated
in this manner was also functional in stoichiometric amounts with other
proteins in replication and ATPase assays.
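
The residual urea carried into each type of experiment follows from simple dilution of the 4 M stock. A short Python sketch, given for illustration (the function name and dictionary are assumptions; the dilution factors are those stated above):

```python
def diluted_molar(stock_molar, fold):
    """Concentration after diluting a stock 'fold'-fold."""
    return stock_molar / fold


UREA_STOCK_M = 4.0
DILUTIONS = {
    "protein-protein interaction": 8,
    "ATPase assay": 300,
    "replication assay": 30_000,
}

for experiment, fold in DILUTIONS.items():
    mm = diluted_molar(UREA_STOCK_M, fold) * 1000.0
    print(f"{experiment:28s} {fold:>6d}-fold -> {mm:8.3f} mM urea")
```

These minimum dilutions give residual urea of the same order as the final concentrations quoted for the ATPase and replication assays.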

In a previous study, a γχψ complex was purified by resolving the δ
and δ' subunits out of the γ complex leaving only a complex of γχψ.
Compared to γ, this γχψ complex was approximately 3-fold more active
in reconstituting the processive polymerase with δ, β, and αε at
elevated salt concentration. The simplest explanation for this result is
that at elevated salt, γχψδ is more active than γδ in assembling the β
ring around primed DNA.

The present invention indicates that a mixture of the γ, δ and δ'
subunits formed a stable (gel filterable) γδδ' complex. The αε
complex and β subunit were incubated with the γδδ' complex (with or
without χ and/or ψ) in a reaction containing SSB "coated" M13mp18
ssDNA primed with a synthetic DNA oligonucleotide and in the presence
of 40 mM added NaCl, and the reaction was incubated at 37°C for 5
minutes to allow the accessory proteins time to assemble the
preinitiation complex clamp and for the αε to bind the preinitiation
complex (the preinitiation complex is known to consist of a β dimer
ring clamped onto the DNA). Replication of the circular DNA was then
initiated upon addition of the remaining dNTPs and was quenched after
20 seconds, sufficient time for the rapid and processive holoenzyme to
complete the circle.

The results indicated that as ψ is titrated into the assay the
replication activity increased approximately 3.5-fold and plateaued at
approximately 1 mol ψ (as monomer) per mol γδδ' complex. ψ (without χ)
stimulates γδδ' and χ does not stimulate the reaction, but the presence
of both χ and ψ yields the most synthesis, as though χ does exert an
influence on the assay but only when ψ is also present.

Previously γ was observed to contain a low level of DNA
dependent ATPase activity (0.1 mol ATP hydrolyzed/mol γ/minute)
compared to the ATPase of the γ complex (6.8 mol ATP/mol γ
complex/minute). The γχψ complex resolved out of the γ complex
appeared to contain approximately 3-4 fold more DNA dependent ATPase
activity than γ, suggesting that χ and/or ψ stimulated the ATPase
activity of γ, or that there was an ATPase activity inherent within χ
and/or ψ. Now that the holC and holD genes have made available pure χ
and ψ in quantity, they have been studied for ATPase
activity and for their effect on the DNA dependent ATPase activity of γ.
As part of these studies of ATPase activity, all possible
combinations of χ, ψ and γ have been tested. These assays were
performed in the presence of M13mp18 ssDNA, one of the best DNA
effectors in the previous study of the γ complex ATPase activity. The
results showed that ψ alone, χ alone, and a mixture of χ and ψ had no
detectable ATPase activity and therefore neither ψ nor χ would appear
to have an intrinsic ATPase activity, although on the basis of negative
evidence we cannot rule out the possibility of a cryptic ATPase; the γ
subunit has a weak ATPase activity. The χ subunit has no effect on the
ATPase activity of γ. However, addition of ψ to γ stimulated the ATPase
activity of γ approximately 3-fold. Titration of ψ into the ATPase assay
showed ψ saturated the ATPase assay at approximately 2 mol ψ (as
monomer) to 1 mol γ (as dimer). Addition of the χ subunit to the γψ
mixture resulted in a further 30% increase in ATPase activity.

In the presence of SSB which "coats" the ssDNA, the ATPase
activity of γ, γψ and γχψ were all greatly reduced (50-fold). However, of
the remaining activity, the γχψ complex was 4-fold more active than γψ,
showing that χ significantly stimulates the γψ ATPase when the DNA is
"coated" with SSB.

The ATPase assay of ψ and χ was extended to the DNA dependent
ATPase activity of the τ subunit. The τ and γ subunits are encoded by the
same gene and, as a result, τ contains the γ sequence plus approximately
another 24 kDa of protein which is responsible for both the ability of τ
to bind DNA and to bind the polymerase subunit, α. In addition, τ has a
much greater DNA dependent ATPase activity than γ, approximately 6-10
mol ATP hydrolyzed/mol τ/minute for a 60-fold greater activity of
τ relative to γ.

Neither ψ, χ, or a mixture of χ and ψ had a significant influence on
the ATPase activity of τ. "Coating" the ssDNA with SSB reduced the
ATPase activity of τ 20-fold, and now the χ and ψ subunits stimulated
the τ ATPase 10-fold to bring its activity back to about half of its value
in the absence of SSB. In this case, with SSB present, the ψ stimulated
τ approximately 3-fold, and χ, which had no effect on τ without ψ,
stimulated the ATPase another 3-fold.

To gain a better understanding of the ψ molecule the present
invention studied the hydrodynamic properties of ψ in gel filtration and
glycerol gradient sedimentation to determine whether ψ is a monomer
or a dimer (or larger). The Stokes radius of ψ was 19 Å upon comparing
its position of elution from a gel filtration column with that of protein
standards of known Stokes radius. The ψ eluted in the same position as
myoglobin (17.5 kDa) indicating ψ is a 15 kDa monomer rather than a
dimer of 30 kDa. The ψ protein sedimented with an S value of 1.95
relative to several protein standards, and was slightly slower than
myoglobin, which is consistent with ψ as a monomer. If a protein has an
asymmetric shape, its migration will not reflect its true weight in
either of these techniques. However, asymmetric shape has opposite
effects in these two techniques, and its influence can be eliminated
because the shape factor cancels when the S value and Stokes radius
are both combined in one mass equation. This calculation results in a
native molecular mass for ψ of 15.76 kDa, close to the 15 kDa
monomeric mass of ψ calculated from its gene sequence. Hence ψ
behaves as a monomer under these conditions. The frictional
coefficient of ψ calculated from its Stokes radius and native mass is
1.13, slightly greater than 1.0, which indicates some asymmetry in the
shape of ψ.
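
The combined mass equation referred to above is commonly written as M = 6πηN·a·s / (1 - v̄ρ), where a is the Stokes radius, s the sedimentation coefficient, η and ρ the solvent viscosity and density, and v̄ the partial specific volume. The Python sketch below illustrates the calculation; the solvent values (water at 20°C) and v̄ = 0.73 ml/g are assumptions, since the specification does not state the values it used, so the result only approximates the quoted 15.76 kDa and 1.13.

```python
import math

AVOGADRO = 6.02214e23   # mol^-1
ETA_WATER = 0.01002     # poise (g cm^-1 s^-1), water at 20 C (assumed)
RHO_WATER = 0.998       # g/ml, water at 20 C (assumed)
V_BAR = 0.73            # ml/g, typical protein value (assumed)


def native_mass(stokes_radius_angstrom, s_value_svedberg):
    """Native mass (g/mol) from Stokes radius and sedimentation coefficient."""
    a_cm = stokes_radius_angstrom * 1e-8
    s_sec = s_value_svedberg * 1e-13
    numerator = 6 * math.pi * ETA_WATER * AVOGADRO * a_cm * s_sec
    return numerator / (1 - V_BAR * RHO_WATER)


def frictional_ratio(stokes_radius_angstrom, mass_da):
    """Stokes radius divided by the radius of an unhydrated sphere of equal mass."""
    sphere_volume_cm3 = V_BAR * mass_da / AVOGADRO
    r0_cm = (3 * sphere_volume_cm3 / (4 * math.pi)) ** (1 / 3)
    return stokes_radius_angstrom * 1e-8 / r0_cm


mass = native_mass(19.0, 1.95)
print(f"native mass: {mass / 1000:.1f} kDa")
print(f"frictional ratio: {frictional_ratio(19.0, mass):.2f}")
```

Because the Stokes radius and S value enter as a product, an elongated shape that raises one and lowers the other leaves the computed mass largely unaffected.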

Although the initial use of 4 M urea would have monomerized ψ if
it were a native dimer, the ψ preparation was diluted such that the
concentration of urea was 0.5 M before it was applied to either the gel
filtration column or the glycerol gradient, and the buffer used in the
column and in the gradient contained no urea. Of course, one should
still be concerned that 0.5 M urea is high enough to disaggregate a
dimer of ψ and that the dimer hasn't time to reassociate during
filtration and sedimentation. Yet under these very conditions it was
found that ψ forms a protein-protein complex with γ, with τ and also
with χ. Therefore it seems likely that if ψ were naturally a dimer,
the dimer could have reformed under these same conditions under which
ψ can bind all these other subunits. Further, a monomeric nature of ψ is
not unusual as most subunits of the holoenzyme are monomers when
isolated (α, ε, θ, χ, δ'; only γ, τ and β are dimers).

A complex of γχψ can be purified from cells indicating that ψ or χ
(or both) must directly interact with γ.

Gel filtration of a mixture of γ with a 4-fold molar excess of ψ
showed that ψ coeluted with the γ subunit followed later by the elution
of the rest of the excess ψ. Hence, the ψ subunit does in fact bind
directly to γ.

The study of the fifth subunit according to the present invention, χ,
began with the N-terminal analysis of χ, which provided a sequence a
portion of which was found to be related, in part, to the sequence of the
xerB gene [see EMBO 8(5):1623 (1989)]. Although not included in the
1692 bp sequence in the publication, a fuller more complete sequence
(from 1 to 2035) of the xerB gene was provided to GenBank. In this
submission, the "front-end" portion of the χ gene according to the
present invention was presented. However, in neither the publication
nor in GenBank was the "front-end" portion identified as coding for a
protein. Based upon the molecular weight of χ as determined in a
SDS-PAGE gel analysis, the "front-end" portion reported in GenBank
predicts approximately 70% of the expected length of χ.

A subsequent literature study located a gene named valS which
was located downstream of the xerB gene. It appeared (and was
confirmed during the research leading to the present invention) that the
χ gene, in its entirety, must be located between the xerB and valS genes.
Edman degradation amino acid sequencing was performed on an
Applied Biosystems 470A gas phase microsequencing apparatus. The
γχψ complex of the holoenzyme was purified, and 10 µg was
electrophoresed in a 15% polyacrylamide gel [see Nature 227:680
(1970)] followed by transfer to an Immobilon PVDF membrane
(Millipore) for N-terminal sequence analysis as with the previous
subunits according to the present invention. Internal peptide sequences
were obtained by electrophoresis of 10 µg of the γχψ complex in 15%
polyacrylamide gel, followed by transfer to nitrocellulose membrane,
digestion by trypsin in situ, and analysis by gas phase microsequencing.

The χ, or holC, gene according to the present invention is located
at 96.5 minutes on the E. coli chromosome and encodes a 147 amino acid
protein of 16.6 kDa.

The recombinant λ phage 5C4 from the overlapping λ library of
Kohara [see Cell 50:495 (1987)] was used in determining the DNA
sequence of the χ gene. The DNA fragment containing the χ gene was
identified and cloned into pUC18 using conventional techniques. The
DNA sequencing of both strands of the χ gene was performed on the
duplex plasmid by the dideoxy chain termination method of Sanger using
the Sequenase kit; sequencing reactions were analyzed on 6%
polyacrylamide, 50% (w/v) urea gels.

The sequences of the primers (5'->3') used for PCR amplification
of the χ gene during its cloning are as follows:

30-mer primer:
CCCCACATAT GAAAAACGCG ACGTTCTACC 30;

28-mer primer:
ACCCGGATCC AAACTGCCGG TGACATTC 28

The 30-mer hybridizes over the initiation codon of χ, and a two
nucleotide mismatch results in a NdeI restriction site (underlined) at
the ATG initiation codon upon amplification of the gene. The 28-mer
anneals within the valS gene downstream of the χ gene; a BamHI
restriction site (underlined) is embedded within the six nucleotides
which do not hybridize to valS.

Using these primers, subsequent studies isolated the χ gene
sequence which is, according to the present invention:
ATG AAA AAC GCG ACG TTC TAC CTT CTG GAC AAT GAC ACC   39
ACC GTC GAT GGC TTA AGC GCC GTT GAG CAC CTG GTG TGT   78
GAA ATT GCC GCA GAA CGT TGG CGC AGC GGT AAG CGC GTG  117
CTC ATC GCC TGT GAA GAT GAA AAG CAG GCT TAC GCC CTG  156
GAT GAA GCC CTG TGG GCG CGT CCG GCA GAA AGC TTT GTT  195
CCG CAT AAT TTA GCG GGA GAA GGA CCG CGC GGC GGT GTA  234
CCG GTG GAG ATC GCC TGG CCG CAA AAG CGT AGC AGC AGC  273
CGG CGC GAT ATA TTG ATT AGT CTG CGA ACA AGC TTT GCA  312
GAT TTT GCC ACC GCT TTT ACA GAA GTG GTA GAC TTC GTT  351
CCT CAT GAA GAT TCT CTG AAA CAA CTG GCG CGC GAA CGC  390
TAT AAA GCC TAC CGC GTG GCT GGT TTC AAC CTG AAT ACG  429
GCA ACT TGG AAA                                      441
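
As a cross-check on the listing, the first line of codons (nucleotides 1-39) can be translated with the standard genetic code. The Python sketch below is illustrative; the table construction and function name are assumptions, and only the first line of the printed sequence is used.

```python
# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into one-letter codes."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


FIRST_LINE = "ATG AAA AAC GCG ACG TTC TAC CTT CTG GAC AAT GAC ACC"
print(translate(FIRST_LINE))  # prints MKNATFYLLDNDT
```

The third codon, AAC, encodes asparagine, in agreement with the amino terminus determined by Edman degradation.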

The upstream portion of the holC gene is:
TAACQGCGAA GAGTAATTGC GTCAGGCAAG GCTGTTATJG_IX23GATGCGG 50
CGTGAACGCC TTATCCGACC TACACAGCAC TGAACTCGTA GGCCTGATAA 100
GACACAACAG CGTCGCATCA GGCGCTGCGG TGTATACCTG ATGCGTATTT 150
AAATCCACCA CAAGAAGCCC CATTT 175
The downstream sequence begins with the stop codon:

TAATGGAAAA GACATATAAC CCACAAGATA TCGAACAGCC 40
GCTTTACGAG CACTGGGAAA AAAGCCAGGA MGTTTCTGC 80
ATCATGATCC CGCCGCCGAA 100

The underlined nucleotide sequences indicate the potential
Shine-Dalgarno sequence (AAGAAG) of holC and the nearest possible promoter
signals (TTGCCG and TATCCG) are highlighted in the first underlined
region. The stop codon of the upstream xerB gene (TAA) and the start
codon of the downstream valS gene (ATG) are each double underlined.
This translated into the correct peptide which is:
Met Lys Asn Ala Thr Phe Tyr Leu Leu Asp Asn Asp Thr Thr Val
                  5                  10                  15
Asp Gly Leu Ser Ala Val Glu Gln Leu Val Cys Glu Ile Ala Ala
                 20                  25                  30
Glu Arg Trp Arg Ser Gly Lys Arg Val Leu Ile Ala Cys Glu Asp
                 35                  40                  45
Glu Lys Gln Ala Tyr Arg Leu Asp Glu Ala Leu Trp Ala Arg Pro
                 50                  55                  60
Ala Glu Ser Phe Val Pro His Asn Leu Ala Gly Glu Gly Pro Arg
                 65                  70                  75
Gly Gly Ala Pro Val Glu Ile Ala Trp Pro Gln Lys Arg Ser Ser
                 80                  85                  90
Ser Arg Arg Asp Ile Leu Ile Ser Leu Arg Thr Ser Phe Ala Asp
                 95                 100                 105
Phe Ala Thr Ala Phe Thr Glu Val Val Asp Phe Val Pro Tyr Glu
                110                 115                 120
Asp Ser Leu Lys Gln Leu Ala Arg Glu Arg Tyr Lys Ala Tyr Arg
                125                 130                 135
Val Ala Gly Phe Asn Leu Asn Thr Ala Thr Trp Lys
                140                 145     147

EXAMPLE IV

(molecular cloning, cell growth and purification)

PCR reactions were performed with both Vent polymerase
(Biolabs) and Taq polymerase. In a 100 µl volume, the PCR reaction was
conducted in a reaction buffer containing 10 mM Tris-HCl (pH 8.3), 50
mM KCl, 1.5 mM MgCl2, and 0.01% (w/v) gelatin, 1.0 µM of each primer,
and 200 µM each dNTP (Pharmacia-LKB), on 1 ng E. coli genomic DNA
(prepared from K12 strain C600) with 2.5 u polymerase. PCR
amplification was performed in a DNA Thermal cycler model 9801
using the following cycle: melting temperature 94°C for 1 min,
annealing temperature 60°C for 2 min, and extension temperature 72°C
for 2 min. After 30 cycles, the amplified DNA was purified by phenol
extraction in 2% SDS followed by sequential digestion of 10 µg DNA
with 10 u NdeI, followed by 10 u BamHI. The 600 bp DNA fragment was
purified from a 0.8% agarose gel by electroelution, and ligated into
pET3c previously digested with both NdeI and BamHI restriction
enzymes. The resulting plasmids (pETχ-1, pETχ-2 and pETχ-3) were
transformed into E. coli strain BL21(DE3)pLysS.

The freshly transformed BL21(DE3)pLysS pETχ cells were grown
in LB media containing 100 µg/ml ampicillin and 30 µg/ml
chloramphenicol at 37°C. Upon growth to an OD600 of 0.7, isopropyl
β-D-thiogalactopyranoside (IPTG) was added to a final concentration of
0.4 mM. Incubation was continued for 3 hr at 37°C before harvesting the
cells.

Seven mg of homogeneous χ protein was purified from a 4-liter
induced culture in which nearly 30% of the overproduced χ protein was in
soluble form. The 4-liter culture was grown to an OD600 of 0.7 before
addition of IPTG to 0.4 mM. After a further 3 hr incubation at 37°C, the
cells (25 g) were harvested, resuspended in 25 ml ice-cold 50 mM
Tris/10% sucrose, and lysed by 25 mg lysozyme on ice for 45 min and a
subsequent incubation at 37°C for 5 min in 5 mM Tris, 1% sucrose, 30
mM spermidine, and 100 mM NaCl. The cell lysate was clarified by
centrifugation at 12,000 rpm for 1 hr at 4°C. All subsequent column
chromatography procedures were at 4°C. All the columns were
equilibrated in buffer A (20 mM Tris (pH 7.5), 0.5 mM EDTA, 5 mM DTT,
and 20% glycerol). The χ protein was followed through the purification
process by SDS-PAGE gel analysis. Total protein was estimated [see
Anal. Biochem. 72:248 (1976)] using bovine serum albumin as a standard.
The soluble lysate (120 ml, 520 mg protein) was dialyzed against 4
liters of buffer A for 16 hours before being loaded onto a 60 ml
heparin-agarose column. The fractions containing χ, which eluted off the
column during wash with buffer A, were pooled (360 ml, 365 mg
protein), and loaded directly onto a FPLC 26/10 Q sepharose fast flow
column. The Q sepharose fast flow column was eluted with a 650-ml
linear gradient of 0 M to 0.5 M NaCl in buffer A. The fractions
containing χ, eluted at approximately 0.16 M salt in a volume of 45 ml
(60 mg protein), were pooled, dialyzed overnight against 4 liters of
buffer A, and loaded onto a 1 ml N-6 ATP-agarose column. The γ complex
(γδδ'χψ) binds to the column tightly due to the strong ATP binding
capacity of γ, while χ protein by itself flows through. This column was
included to eliminate any γ complex from the χ preparation.

The flow-through of the ATP-agarose column was loaded onto an
8 ml hexylamine column and χ was eluted with an 80 ml linear gradient
of 0 to 0.5 M NaCl in buffer A. The χ protein eluted at approximately
0.25 M salt. Fractions containing the peptide (81 ml, 36 mg protein)
were pooled and dialyzed against buffer A. The χ protein was loaded
onto an 8 ml FPLC Mono Q column and eluted with an 80 ml linear
gradient of 0 to 0.5 M NaCl in buffer A. The fractions containing χ (28
ml, 8.5 mg protein) eluted sharply at 0.16 M salt. The χ protein was
pooled and dialyzed overnight against 4 liters of buffer A, then
aliquoted and stored at -70°C.
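
As a rough illustration of the gradient elutions described above, the following sketch estimates how far into a linear salt gradient a given NaCl concentration is reached. It assumes an ideal linear gradient with no dead volume or mixing delay, which is a simplification; the function name is hypothetical and not part of the original procedure.

```python
def gradient_volume_at(conc_m, start_m, end_m, gradient_ml):
    """Volume (ml) into an ideal linear gradient at which conc_m is reached.

    Assumes the concentration rises linearly from start_m to end_m over
    gradient_ml, ignoring column dead volume (an idealization).
    """
    low, high = min(start_m, end_m), max(start_m, end_m)
    if not low <= conc_m <= high:
        raise ValueError("concentration lies outside the gradient range")
    return (conc_m - start_m) / (end_m - start_m) * gradient_ml


# Q Sepharose step: 650 ml gradient, 0 to 0.5 M NaCl, chi elutes near 0.16 M
q_sepharose_ml = gradient_volume_at(0.16, 0.0, 0.5, 650)

# Mono Q step: 80 ml gradient, 0 to 0.5 M NaCl, chi elutes near 0.16 M
mono_q_ml = gradient_volume_at(0.16, 0.0, 0.5, 80)

print(f"Q Sepharose: about {q_sepharose_ml:.0f} ml into the gradient")
print(f"Mono Q: about {mono_q_ml:.1f} ml into the gradient")
```

Under these assumptions the 0.16 M point falls roughly a third of the way through either gradient; real elution positions depend on the column and plumbing.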

The concentration of purified χ protein was determined from its
absorbance at 280 nm. The molar extinction coefficient at 280 nm
(ε280) of a protein in its native state can be calculated from its gene
sequence to within +/- 5% by using the equation
ε280 = Trp_m (5690 M⁻¹cm⁻¹) + Tyr_n (1280 M⁻¹cm⁻¹)
[see Analytical Biochemistry 182:319 (1989)], wherein m and n are the
numbers of tryptophan and tyrosine residues, respectively, in the
peptide. The molar extinction coefficients of tryptophan and tyrosine
are known [see Biochemistry 6:1948 (1967)]. For χ protein, where m
equals 4 and n equals 5, ε280 = 29,160 M⁻¹cm⁻¹. The χ protein was
dialyzed against buffer A containing no DTT prior to absorbance
measurement. SDS-PAGE was in 15% polyacrylamide 0.1% SDS gel in
Tris/glycine/SDS buffer [see Nature 227:680 (1970)]. Proteins were
visualized by Coomassie Brilliant Blue stain.
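
The extinction coefficient calculation above can be checked with a short sketch. The per-residue values and the residue counts for χ are taken from the text; the absorbance reading used at the end is a made-up number for illustration only, and the function names are hypothetical.

```python
TRP_E280 = 5690  # M^-1 cm^-1 per tryptophan, as quoted in the text
TYR_E280 = 1280  # M^-1 cm^-1 per tyrosine, as quoted in the text


def molar_extinction_280(n_trp, n_tyr):
    """Molar extinction coefficient at 280 nm from Trp and Tyr counts."""
    return n_trp * TRP_E280 + n_tyr * TYR_E280


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: concentration (M) = A / (epsilon * path length)."""
    return a280 / (epsilon * path_cm)


# chi protein: m = 4 tryptophans, n = 5 tyrosines (from the text)
eps_chi = molar_extinction_280(4, 5)
print(eps_chi)  # 29160

# Hypothetical absorbance reading, for illustration only
conc_m = molar_concentration(0.2916, eps_chi)
print(f"{conc_m * 1e6:.1f} micromolar")
```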

The γχψ complex was purified from 1.3 kg of E. coli. The χ subunit
was resolved from the γ and ψ subunits upon electrophoresis in a 15%
SDS polyacrylamide gel followed by transfer of χ onto PVDF membrane
for N-terminal sequence analysis (210 pmol), and onto nitrocellulose
membrane for tryptic analysis (300 pmol). Proteins were visualized by
Ponceau S stain. The amino terminus of χ was determined to be:

NH2-Met Lys Asn Ala Thr Phe Tyr Leu Leu Asp Asn Asp Thr Thr Val

5 10 15

Asp Gly Leu Ser Ala Val Glu Gln Leu Val Xxx Glu Ile Ala
20 25

wherein Xxx is an unidentified residue. The sequences of four internal
peptides obtained by tryptic digestion were determined to be:



χ-2:

NH2-Leu Asp Glu Ala Leu Trp Ala Ala Pro Ala Glu Ser Phe Val Pro

5 10 15

His Asn Leu Ala Gly Glu

χ-1:

NH2-Val Leu Ile Ala Xxx Glu Asp Glu Lys

5





The 3.4 kb BamHI fragment containing holC was excised from λ
5C4 and ligated into the BamHI site of pUC-χ. Both strands of the holC
gene were sequenced on the duplex plasmid by the chain termination
method of Sanger with synthetic 17-mer DNA oligonucleotides.
Sequencing reactions were analyzed on 6% (w/v) acrylamide, 50% (w/v)
urea gels and were performed with both dGTP and dITP.

The sequences of the primers used to amplify the holC gene were:
Upstream 30-mer:

CCCC ACATAT G AAAAACGCG ACGTTCTACC 30
Downstream 28-mer:

ACC CGGATCC AAACTGCCGG TGAOGTTC 28
The upstream 30-mer hybridizes over the initiation codon of holC, and a
two-nucleotide mismatch results in an NdeI restriction site (underlined
above) at the ATG initiation codon upon amplification of the gene. The
downstream 28-mer hybridizes within the valS gene downstream of holC.
A BamHI restriction site (underlined) is embedded in the sequence,
resulting in three nucleotides which are not complementary to valS.
Amplification reactions contained 1.0 nM of each primer, 200 µM of
each dNTP, 1 ng E. coli genomic DNA (from strain C600), and 2.5 units of
Taq I DNA polymerase in a final volume of 100 µl of 10 mM Tris-HCl (pH
8.3), 50 mM KCl, 1.5 mM MgCl2, and 0.01% (w/v) gelatin. Amplification
was performed in a thermal cycler using the following cycle: 94°C, 1
minute; 60°C, 2 minutes; and 72°C, 2 minutes. After 30 cycles, the
amplified 604 bp DNA was purified by phenol extraction in 2% SDS
followed by sequential digestion of 10 µg DNA in 10 units of NdeI and
then 10 units of BamHI according to manufacturer's specifications. The
NdeI-BamHI fragment was electroeluted from a 0.8% agarose gel and
ligated into gel-purified pET3c previously digested with both NdeI and
BamHI. The resulting plasmid, pET-χ, was sequenced, which confirmed
that no errors had been introduced during amplification, and it was then
transformed into strain BL21(DE3)pLysS.
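
The primer design described above can be sanity-checked with a small sketch that locates the engineered restriction sites. The recognition sequences CATATG (NdeI) and GGATCC (BamHI) are standard; the primer strings are transcribed from this description, and one base of the downstream primer is illegible in the source, so it is written here as N (an assumption made only so the string has the stated length).

```python
NDEI_SITE = "CATATG"   # NdeI recognition sequence
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

# Primers as given in the description (5' to 3'), spaces removed.
upstream_30mer = "CCCCACATATGAAAAACGCGACGTTCTACC"
# "N" marks a base that is illegible in the source text.
downstream_28mer = "ACCCGGATCCAAACTGCCGGTGANGTTC"


def site_position(primer, site):
    """Return the 0-based index of a restriction site, or -1 if absent."""
    return primer.find(site)


nde_pos = site_position(upstream_30mer, NDEI_SITE)
bam_pos = site_position(downstream_28mer, BAMHI_SITE)

# The last three bases of the NdeI site (CATATG) form the ATG start codon.
start_codon = upstream_30mer[nde_pos + 3:nde_pos + 6]

print(len(upstream_30mer), len(downstream_28mer))
print("NdeI site at index", nde_pos, "start codon:", start_codon)
print("BamHI site at index", bam_pos)
```

Placing the ATG inside the NdeI site is what lets the coding sequence be joined to the vector's translation signals without extra residues.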

The γ subunit was purified from an overproducing strain, and the
δ, δ' and ψ subunits were purified from their respective overproducing
strains as described above. A mixture of 48 µg γ (0.51 nmol as dimer),
144 µg δ (3.7 nmol as monomer), 144 µg δ' (3.9 nmol as monomer), and
192 µg ψ (12.7 nmol as monomer) was incubated at 15°C for 1 hour and
then loaded onto a 1 ml HR 5/5 Mono Q column. The concentration of γ
was determined using BSA as a standard. Concentrations of δ, δ' and ψ
were determined from their absorbance at 280 nm using the molar
extinction coefficients 46,137 M⁻¹cm⁻¹, 60,136 M⁻¹cm⁻¹, and 24,040
M⁻¹cm⁻¹, respectively. The column was eluted with a 32 ml gradient
of 0 M - 0.4 M NaCl in 20 mM Tris-HCl (pH 7.5), 0.5 mM EDTA, 2 mM DTT,
and 20% glycerol (buffer A), whereupon the γδδ'ψ complex resolved from
uncomplexed subunits by eluting later than all the rest. Eighty
fractions were collected and analyzed by a Coomassie Blue stained 15%
SDS polyacrylamide gel. Fractions containing the γδδ'ψ complex were
pooled, the protein concentration was determined using BSA as a
standard, and then the γδδ'ψ complex was aliquoted and stored at -70°C.
Molarity of γδδ'ψ was calculated from the protein concentration assuming
the 185 kDa mass calculated from gene sequences and a
stoichiometry of γ2δ1δ'1ψ1, as expected from the tentative
stoichiometry of subunits in the γ complex.
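
The mass-to-mole conversion implied above can be sketched as follows. The 185 kDa figure is the complex mass stated in the text, and the 37 ng and 0.14 ng amounts are those quoted for the reconstitution assay in this description; the function name is hypothetical.

```python
def moles_from_mass(mass_g, molar_mass_g_per_mol):
    """Convert a mass in grams to moles given a molar mass in g/mol."""
    if molar_mass_g_per_mol <= 0:
        raise ValueError("molar mass must be positive")
    return mass_g / molar_mass_g_per_mol


COMPLEX_MOLAR_MASS = 185_000.0  # g/mol, the 185 kDa mass assumed in the text

# 37 ng of complex used in the preincubation, expressed in pmol
preincubation_pmol = moles_from_mass(37e-9, COMPLEX_MOLAR_MASS) * 1e12

# 0.14 ng of complex added to the assay, expressed in fmol
assay_fmol = moles_from_mass(0.14e-9, COMPLEX_MOLAR_MASS) * 1e15

print(f"{preincubation_pmol:.2f} pmol")
print(f"{assay_fmol:.2f} fmol")
```

Both results agree with the amounts quoted for the assay (0.2 pmol and 0.76 fmol) to the stated precision.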

The reconstitution assay contained 72 ng M13mp18 ssDNA (0.03
pmol as circles) uniquely primed with a DNA 30-mer, 980 ng SSB (13.6
pmol as tetramer), 10 ng β (0.13 pmol as dimer), 55 ng αε complex
(0.35 pmol) in a final volume (after addition of proteins) of 25 µl 20
mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 8 mM MgCl2, 5 mM DTT, 4% glycerol,
40 µg/ml BSA, 0.5 mM ATP, 40 mM NaCl, 60 µM each dCTP, dGTP, dATP,
and 20 µM [α-32P]. Pure χ protein or column pool containing χ (1-12 ng)
was preincubated on ice for 30 minutes with 37 ng γδδ'ψ complex (0.2
pmol) in 20 µl of 20 mM Tris-HCl (pH 7.5), 2 mM DTT, 0.5 mM EDTA, 20%
glycerol, and 50 µg/ml BSA before dilution with the same buffer such
that 0.14 ng (0.76 fmol) of the γδδ'ψ complex was added to the assay in a
1-2 µl volume. The assay was then shifted to 37°C for 5 minutes. DNA
synthesis was quenched by spotting directly onto DE81 filter paper and
quantitated. The αε complex, β and SSB proteins used in the
reconstitution assay were purified and their concentrations determined
using BSA as a standard, except for β, which was determined by
absorbance using an ε280 value of 17,900 M⁻¹cm⁻¹.

Gel filtration analysis was performed using the Pharmacia HR
10/30 fast protein liquid chromatography columns, Superdex 75 and
Superose 12. Proteins were incubated together for 1 hour at 15°C in a
final volume of 200 µl buffer B (25 mM Tris-HCl (pH 7.5), 1 mM EDTA,
10% glycerol, and 100 mM NaCl). The ψ protein was first brought to 4 M
in urea to disaggregate it, and when present with other proteins the
final concentration of urea in buffer B was 0.5 M. The entire sample
was injected, the column was developed with buffer B, and after
collecting the first 6 ml, fractions of 170 µl were collected. The χ
protein was located in column fractions by analysis in 15%
SDS-polyacrylamide gels. Densitometry of Coomassie Blue-stained gels was
performed using a laser densitometer (Ultrascan XL).
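
To relate a fraction number to its elution volume in the gel filtration scheme above, a minimal sketch is given here. It assumes fractions are numbered from 1 beginning immediately after the first 6 ml, which the text does not state explicitly; the function name is hypothetical.

```python
def fraction_window_ml(fraction_number, initial_ml=6.0, fraction_ul=170.0):
    """Return (start, end) elution volume in ml for a given fraction.

    Assumes fraction 1 begins right after the initial_ml that is collected
    before fractionation starts, and that every fraction has equal volume.
    """
    if fraction_number < 1:
        raise ValueError("fraction numbers start at 1")
    step_ml = fraction_ul / 1000.0
    start = initial_ml + (fraction_number - 1) * step_ml
    return start, start + step_ml


start_ml, end_ml = fraction_window_ml(10)
print(f"fraction 10 spans {start_ml:.2f} to {end_ml:.2f} ml")
```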

Individual samples of χ (46 µg, 2.8 nmol as monomer) and of ψ (45
µg, 3 nmol as monomer), and a mixture of χ (218 µg, 13 nmol as
monomer) and ψ (45 µg, 3 nmol as monomer) were incubated 30 minutes
at 4°C in 200 µl buffer B with 5% glycerol (samples containing ψ
contained a final concentration of 0.5 M urea in the 200 µl as explained
above). Samples were layered onto 12.3 ml gradients of 10%-30%
glycerol in 25 mM Tris-HCl (pH 7.5), 0.1 M NaCl and 1 mM EDTA. Protein
standards in 200 µl of buffer B with 5% glycerol were layered onto
another gradient and the gradients were centrifuged at 270,000 x g for
44 hours at 4°C. Fractions were collected and analyzed.

The polymerase chain reaction was used to precisely clone the
holC gene into the T7 based pET expression system [see Methods in
Enzymology 185:60 (1990)]. Primers upstream and downstream of holC
were synthesized to amplify a 604 bp fragment containing the holC gene
from E. coli genomic DNA. The upstream primer hybridized over the
start codon of holC and included two mismatched nucleotides in order
to create an NdeI restriction site at the initiating ATG. The primer
downstream of holC included three mismatched nucleotides to create a
BamHI restriction site. The amplified 604 bp fragment was digested
with NdeI and BamHI and cloned into the NdeI-BamHI site of the T7
based expression vector pET3c to yield pET-χ. In the pET-χ plasmid, the
holC gene is under the control of a strong T7 RNA polymerase promoter
and an efficient Shine-Dalgarno sequence in favorable context for
translation initiation. DNA sequencing of the pET-χ plasmid showed its
sequence was identical to that of pUC-χ, and therefore no errors were
incurred during amplification.

The pET-χ expression plasmid was introduced into strain
BL21(DE3)pLysS, which is a lysogen carrying the T7 RNA polymerase gene
under the control of the IPTG-inducible lac UV5 promoter. Upon
induction with IPTG and continued growth for 3 hours, the χ protein was
expressed to a level of 27% total cell protein. Upon cell lysis, only
about 30% of the χ protein was in the soluble fraction, the rest being
found in the cell debris. Induction at lower temperature (20°C) or for
shorter times did not appear to increase the proportion of χ in the
soluble fraction.

Four liters of induced cells were lysed and 38 mg of pure χ was
obtained in 38% overall yield upon fractionation with ammonium sulfate
precipitation, followed by column chromatography using Q Sepharose
and heparin agarose. The χ protein was well behaved throughout the
purification and showed no tendency to aggregate. The N-terminal
sequence analysis of the pure cloned χ matched that of the holC gene,
indicating that the protein had been successfully cloned and purified.
The expressed χ protein also comigrated with the authentic χ subunit
contained within pol III*.

In summary, as a result of the present invention, the location and
sequence of χ were determined. The χ subunit (400 pmol) was separated
from the γ and ψ subunits of the γχψ complex by SDS denaturation and
resolution on a 15% polyacrylamide gel, and 100 pmol was transferred to a
PVDF Immobilon membrane for amino terminal sequence analysis; the
remainder was transferred to nitrocellulose for sequence analysis of
internal peptides following trypsin digestion. After transfer, the
protein was visualized by Ponceau S stain and excised from the gel. The
sequences of the N-terminal amino acids and four internal peptides were
determined as described above, and these sequences were used to
search the GenBank database. One single exact match was found at
about 96.5 minutes on the E. coli chromosome between the xerB and
valS genes.

The recombinant Kohara λ clone 5C4 contains the DNA fragment
encompassing the xerB and partial valS genes, and the χ gene was
subcloned by ligation of the BamHI fragment of λ 5C4 into pUC18.
Sequence analysis was performed directly on the plasmid. As shown
above, the open reading frame of the χ gene was 441 nucleotides long.
Its initiation codon is 160 nucleotides downstream of the stop codon of
the xerB gene, while its termination codon, TAA, has one base
overlapping with the start codon of the valS gene. That the xerB and χ
genes are transcribed in the same direction, and that no promoter
consensus sequences were found for the χ gene alone, suggests that
these two genes are in the same operon.

When PCR was applied to clone the χ gene into the T7 based
expression system, PCR primers based upon the known sequences of the
xerB and valS genes were made to amplify the fragment between the
two genes. As described, E. coli genomic DNA was used as the PCR
template, and a fragment of approximately 600 base pairs was
amplified. The PCR fragment, after being digested with NdeI and BamHI,
was cloned into the NdeI-BamHI site of the expression vector pET3c in
similar manner to what was done with the preceding gene sequences.
Thus, the putative χ gene was put under the control of a strong T7 RNA
polymerase promoter as well as the efficient translation
initiation signal, and transcription termination sequence downstream
of the BamHI site. Direct DNA sequencing of the plasmids formed
showed that they were all identical to the sequence of the λ clone.

The resulting plasmids were transformed into E. coli
BL21(DE3)pLysS, a lysogen carrying the T7 RNA polymerase
gene under the control of the IPTG-inducible lac UV5 promoter [see
Methods in Enzymology 185:60 (1990)]. Transformants were selected by
ampicillin and chloramphenicol resistance, and subsequently subjected
to IPTG induction as described above. A protein of about 17 kDa was
overproduced in all three PCR clones. The γ complex was run in parallel
with the three clones on an SDS-PAGE gel, and when the overproduced
protein and the χ subunit were at similar amounts, they showed the same
gel mobility. This observation supported the identity of the overproduced
protein as the χ subunit.

In addition to the specific sequences provided above for the 
individual genes according to the present invention, the present 
invention also extends to mutations, deletions and additions to these
sequences provided such changes do not substantially affect the present 
properties of the listed sequences. 

As described, the naturally occurring holoenzyme consists of 10
protein subunits and is capable of extending DNA faster than
polymerase I, and producing a product many times larger than the
polymerase I enzyme. Thus, these unique properties of the 5, preferably
6, active subunits of the present invention are likely to find wide
application in, for example: long chain PCR (using the active sequence
according to the present invention, PCR can be performed over several
tens of kb); PCR performed at room temperature (the active sequence
according to the present invention is uniquely adapted to be a
polymerase of choice for PCR at room temperature due to its high
fidelity); extension of site mutated primers without catalyzing strand
displacement; and sequencing operations wherein other polymerases
find difficulty. Other uses will become more apparent to those skilled
in the art as the science of molecular genetics continues to progress.

The sequence listing for the nucleic acid sequences and peptide
sequences which are contained in the present description is as follows:
Thus, while I have illustrated and described the preferred
embodiment of my invention, it is to be understood that this invention
is capable of variation and modification, and I therefore do not wish to
be limited to the precise terms set forth, but desire to avail myself of
such changes and alterations which may be made for adapting the
invention to various usages and conditions. Accordingly, such changes
and alterations are properly intended to be within the full range of
equivalents, and therefore within the purview of the following claims.

Having thus described my invention and the manner and a process
of making and using it in such full, clear, concise and exact terms so as
to enable any person skilled in the art to which it pertains, or with
which it is most nearly connected, to make and use the same;
I claim:




(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
Met Leu Arg Leu Tyr Pro Glu Gln Leu Arg Ala Gln Leu Asn Glu
5 10 15

Gly Leu Arg Ala Ala Tyr Leu Leu Leu Gly
20 25

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro Leu Leu Leu Gln
5 10

Glu Ser Gln Asp Ala Val Arg
15 20

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
Ala Gln Glu Asn Ala Ala Trp Phe Thr Ala Leu Ala Asn Arg
5 10 14

(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Val Glu Gln Ala Val Asn Asp Ala Ala His Phe Thr Pro Phe His
5 10 15

Trp Val Asp Ala Leu Leu Met Gly Lys
20 24



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GTACAACCGA ATCATATGTT ACCCAGCGAG CTC 33 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1032 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

ATG ATT CGG TTG TAC CCG GAA CAA CTC CGC GCG CAG CTC 39 

AAT GAA GGG CTG CGC GCG GCG TAT CTT TTA CTT GGT AAC 78 

GAT CCT CTG TTA TTG CAG GAA AGC CAG GAC GCT GTT CGT 117

CAG GTA GCT GCG GCA CAA GGA TTC GAA GAA CAC CAC ACT 156 

TTT TCC ATT GAT CCC AAC ACT GAC TGG AAT GCG ATC TTT 195 

TCG TTA TGC CAG GCT ATG AGT CTG TTT GCC AGT CGA CAA 234

ACG CTA TTG CTG TTG TTA CCA GAA AAC GGA CCG AAT GCG 273 

GCG ATC AAT GAG CAA CTT CTC ACA CTC ACC GGA CTT CTG 312 

CAT GAC GAC CTG CTG TTG ATC GTC CGC GGT AAT AAA TTA 351 

AGC AAA GCG CAA GAA AAT GCC GCC TGG TTT ACT GCG CTT 390 

GCG AAT CGC AGC GTG CAG GTG ACC TGT CAG ACA CCG GAG 429 

CAG GCT CAG CTT CCC CGC TGG GTT GCT GCG CGC GCA AAA 468 

CAG CTC AAC TTA GAA CTG GAT GAC GCG GCA AAT CAG GTG 507 

CTC TGC TAC TGT TAT GAA GGT AAC CTG CTG GCG CTG GCT 546 

CAG GCA CTG GAG CGT TTA TCG CTG CTC TGG CCA GAC GGC 585 

AAA TTG ACA TTA CCG CGC GTT GAA CAG GCG GTG AAT GAT 624 

GCC GCG CAT TTC ACC CCT TTT CAT TGG GTT GAT GCT TTG 663 

TTG ATG GGA AAA AGT AAG CGC GCA TTG CAT ATT CTT CAG 702 

CAA CTG CGT CTG GAA GGC AGC GAA CCG GTT ATT TTG TTG 741 

CGC ACA TTA CAA CGT GAA CTG TTG TTA CTG GTT AAC CTG 780 

AAA CGC CAG TCT GCC CAT ACG CCA CTG CGT GCG TTG TTT 819 

GAT AAG CAT CGG GTA TGG CAG AAC CGC CGG GGC ATG ATG 858 

GGC GAG GCG TTA AAT CGC TTA AGT CAG ACG CAG TTA CGT 897 

CAG GCC GTG CAA CTC CTG ACA CGA ACG GAA CTC ACC CTC 936 

AAA CAA GAT TAC GGT CAG TCA GTG TGG GCA GAG CTG GAA 975 




GGG TTA TCT CTT CTG TTG TGC CAT AAA CCC CTG GCG GAC 1014
GTA TTT ATC GAC GGT TGA 1032 
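
As a consistency check between the nucleotide and peptide entries in this listing, the following sketch converts a coding-sequence length into the number of encoded residues. The lengths used (1032 bp ending in a TGA stop codon, and 1002 bp with no stop codon shown) are those given in this listing; the function name is hypothetical.

```python
def residues_encoded(orf_bp, includes_stop_codon):
    """Number of amino acid residues encoded by an open reading frame.

    orf_bp must be a whole number of codons. If the stop codon is counted
    in orf_bp, it is subtracted because it encodes no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop_codon else codons


# SEQ ID NO: 6 is 1032 bp and ends with a TGA stop codon
print(residues_encoded(1032, includes_stop_codon=True))   # 343

# SEQ ID NO: 13 is 1002 bp and is listed without a stop codon
print(residues_encoded(1002, includes_stop_codon=False))  # 334
```

These counts match the lengths stated for the peptides of SEQ ID NO: 9 (343 amino acids) and SEQ ID NO: 10 (334 amino acids).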

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CCGAACAGCT GATTCGTAAG CTGCCAAGCA TCCGTGCTGC GGATATTCGT 50 
TCCGACGAAG AACAGACGTC GACCACAACG GATACTCCGG CAACGCCTGC 100 
ACGCGTCTCC ACCACGCTGG GTAACTG 127 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TGA ATGAAATCT TTACAGGCTC TGTTTGGCGG CACCTTTGAT 43 
CCGGTGCACT ATGGTCATCT AAAACCCGTT GGAAGCGTGG CCGAAGTTTT 93 
GATTGGTCTG AC 105 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 343 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
Met Ile Arg Leu Tyr Pro Glu Gln Leu Arg Ala Gln Leu Asn Glu
5 10 15

Gly Leu Arg Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro Leu Leu
20 25 30

Leu Gln Glu Ser Gln Asp Ala Val Arg Gln Val Ala Ala Ala Gln
35 40 45

Gly Phe Glu Glu His His Thr Phe Ser Ile Asp Pro Asn Thr Asp
50 55 60

Trp Asn Ala Ile Phe Ser Leu Cys Gln Ala Met Ser Leu Phe Ala
65 70 75

Ser Arg Gln Thr Leu Leu Leu Leu Leu Pro Glu Asn Gly Pro Asn
80 85 90

Ala Ala Ile Asn Glu Gln Leu Leu Thr Leu Thr Gly Leu Leu His
95 100 105

Asp Asp Leu Leu Leu Ile Val Arg Gly Asn Lys Leu Ser Lys Ala
110 115 120

Gln Glu Asn Ala Ala Trp Phe Thr Ala Leu Ala Asn Arg Ser Val
125 130 135

Gln Val Thr Cys Gln Thr Pro Glu Gln Ala Gln Leu Pro Arg Trp
140 145 150

Val Ala Ala Arg Ala Lys Gln Leu Asn Leu Glu Leu Asp Asp Ala
155 160 165

Ala Asn Gln Val Leu Cys Tyr Cys Tyr Glu Gly Asn Leu Leu Asn
170 175 180

Leu Ala Gln Ala Leu Glu Arg Leu Ser Leu Leu Trp Pro Asp Gly
185 190 195

Lys Leu Thr Leu Pro Arg Val Glu Gln Ala Val Asn Asp Ala Ala
200 205 210

His Phe Thr Pro Phe His Trp Val Asp Ala Leu Leu Met Gly Lys
215 220 225

Ser Lys Arg Ala Leu His Ile Leu Gln Gln Leu Arg Leu Gly Gly
230 235 240

Ser Glu Pro Val Ile Leu Leu Arg Thr Leu Gln Arg Glu Leu Leu
245 250 255

Leu Leu Val Asn Leu Lys Arg Gln Ser Ala His Thr Pro Leu Arg
260 265 270

Ala Leu Phe Asp Lys His Arg Val Trp Gln Asn Arg Arg Gly Met
275 280 285

Met Gly Glu Ala Leu Asn Arg Leu Ser Gln Thr Gln Leu Arg Gln
290 295 300

Ala Val Gln Leu Leu Thr Arg Thr Glu Leu Thr Leu Lys Gln Asp
305 310 315

Tyr Gly Gln Ser Val Trp Ala Glu Leu Glu Gly Leu Ser Leu Leu
320 325 330

Leu Cys His Lys Pro Leu Ala Asp Val Phe Ile Asp Gly
335 340 343

(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 334 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Arg Trp Tyr Pro Trp Leu Arg Pro Asp Phe Glu Lys Leu Val
5 10 15

Ala Ser Tyr Gln Ala Gly Arg Gly His His Ala Leu Leu Ile Gln
20 25 30

Ala Leu Pro Gly Met Gly Asp Asp Ala Leu Ile Tyr Ala Leu Ser
35 40 45

Arg Tyr Leu Leu Cys Gln Gln Pro Gln Gly His Lys Ser Cys Gly
50 55 60

His Cys Arg Gly Cys Gln Leu Met Gln Ala Gly Thr His Pro Asp
65 70 75

Tyr Tyr Thr Leu Ala Pro Glu Lys Gly Lys Asn Thr Leu Gly Val
80 85 90

Asp Ala Val Arg Glu Val Thr Glu Lys Leu Asn Glu His Ala Arg
95 100 105

Leu Gly Gly Ala Lys Val Val Trp Val Thr Asp Ala Ala Leu Leu
110 115 120

Thr Asp Ala Ala Ala Asn Ala Leu Leu Lys Thr Leu Glu Glu Pro
125 130 135

Pro Ala Glu Thr Trp Phe Phe Leu Ala Thr Arg Glu Pro Glu Arg
140 145 150

Leu Leu Ala Thr Leu Arg Ser Arg Cys Arg Leu His Tyr Leu Ala
155 160 165

Pro Pro Pro Glu Gln Tyr Ala Val Thr Trp Leu Ser Arg Glu Val
170 175 180

Thr Met Ser Gln Asp Ala Leu Leu Ala Ala Leu Arg Leu Ser Ala
185 190 195

Gly Ser Pro Gly Ala Ala Leu Ala Leu Phe Gln Gly Asp Asn Trp
200 205 210

Gln Ala Arg Glu Thr Leu Cys Gln Ala Leu Ala Tyr Ser Val Pro
215 220 225

Ser Gly Asp Trp Tyr Ser Leu Leu Ala Ala Leu Asn His Glu Gln
230 235 240

Ala Pro Ala Arg Leu His Trp Leu Ala Thr Leu Leu Met Asp Ala
245 250 255

Leu Lys Arg His His Gly Ala Ala Gln Val Thr Asn Val Asp Val
260 265 270

Pro Gly Leu Val Ala Glu Leu Ala Asn His Leu Ser Pro Ser Arg
275 280 285

Leu Gln Ala Ile Leu Gly Asp Val Cys His Ile Arg Glu Gln Leu
290 295 300

Met Ser Val Thr Gly Ile Asn Arg Glu Leu Leu Ile Thr Asp Leu
305 310 315

Leu Leu Arg Ile Glu His Tyr Leu Gln Pro Gly Val Val Leu Pro
320 325 330

Val Pro His Leu
334

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 57 bp

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ACTCTGGAAG AACCGCCGGC TGAAACTTGG TTTTTTCTGG CTACTCGTGA 50
ACCGGAA 57

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESSS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GCTGGTTCTC CGGGTGCTGC TCTGGCTCTG TTTCAGGGTG ATGACTGGCA 50 
GGCT 54

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1002 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATG AGA TGG TAT CCA TGG TTA CGA CCT GAT TTC GAA AAA 39
CTG GTA GCC AGC TAT CAG GCC GGA AGA GGT CAC CAT GCG 78
CTA CTC ATT CAG GCG TTA CCG GGC ATG GGC GAT GAT GCT 117
TTA ATC TAC GCC CTG AGC CGT TAT TTA CTC TGC CAA CAA 156
CCG CAG GGC CAC AAA AGT TGC GGT CAC TGT CGT GGA TGT 195
CAG TTG ATG CAG GCT GGC ACG CAT CCC GAT TAC TAC ACC 234
CTG GCT CCC GAA AAA GGA AAA AAT ACG CTG GGC GTT GAT 273
GCG GTA CGT GAG GTO ACC GAA MG CTG AAT GAG CAC GCA 312
CGC TTA GGT GGT GCG AAA GTC GTT TGG GTA ACC GAT GCT 351
GCC TTA CTA ACC GAC GCC GCG GCT AAC GCA TTG CTG AAA 390
ACG CTT GAA GAG CCA CCA GCA GAA ACT TGG TTT TTC CTG 429
GCT ACC CGC GAG CCT GAA CGT TTA CTG GCA ACA TTA CGT 468
AGT CGT TGT CGG TTA CAT TAC CTT GCG CCG CCG CCG GAA 507
CAG TAC GCC GTG ACC TGG CTT TCA CGC GAA GTG ACA ATG 546
TCA CAG GAT GCA TTA CTT GCC GCA TTG CGC TTA AGC GCC 585
GGT TCG CCT GGC GCG GCA CTG GCG TTG TTT CAG GGA GAT 624
AAC TGG CAG GCT CGT GAA ACA TTG TGT CAG GCG TTG GCA 663
TAT AGC GTG CCA TCG GGC GAT TGG TAT TCG CTG CTA GCG 702
GCC CTT AAT CAT GAA CAA GTC CCG GCG CGT TTA CAC TGG 741
CTG GCA ACG TTG CTG ATG GAT GCG CTA AAA CGC CAT CAT 780
GGT GCT GCG CAG GTG ACC AAT GTT GAT GTG CCG GGC CTG 819
GTC GCC GAA CTG GCA AAC CAT CTT TCT CCC TCG CGC CTG 858
CAG GCT ATA CTG GGG GAT GTT TGC CAC ATT CGT GAA CAG 897
TTA ATG TCT GTT ACA GGC ATC AAC CGC GAG CTT CTC ATC 936
ACC GAT CTT TTA CTG CGT ATT GAG CAT TAC CTG CAA CCG 975
GGC GTT GTG CTA CCG GTT CCT CAT CTT 1002

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 157 bp

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

AAGAATCTTT CGATTTCTTT AATCGCACCC GCGCCCGCTA TCTGGAACTG 50
GCAGCACAAG ATAAAAGCAT TCATACCATT GATGCCACCC AGCCGCTGGA 100
GGCCGTGATG GATGCAATCC GCACTACCGT GACCCACTGG GTGAAGGAGT 150
TGGACGC 157

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 143 bp
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
TTA GAGAGACATC ATGTTTTTAG TCGACTCACA CTGCCATCTC 43






GATGGTCTGG ATTATGAATC TTTGCATMG GACGTGGATG ACGTTCTGGC 93
GAAAGCCGCC GCACGCGATG TGAAATTTTG TCTGGCAGTC GCCACAACAT 143 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
Met Arg Trp Tyr Pro Pro Leu Arg Pro Asp Phe Glu Lys Leu Val Ala
5 10 15

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
Glu Val Thr Glu Lys Leu Asn Glu His Ala Arg
5 10

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
Val Val Trp Val Thr Asp Ala Ala Leu Leu Thr Asp Ala Ala
5 10

Asn Ala Leu Leu Lys
20

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
Thr Leu Glu Glu Pro Pro Ala Glu Thr Trp Phe Phe Leu Ala Thr
5 10 15

Arg Glu Pro Glu Arg Leu Leu Ala Thr Leu
20 25

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
Leu His Tyr Leu Ala Pro Pro Pro Glu Gln Tyr Ala Val Thr Trp
5 10 15

Leu Ser Arg
18

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Leu Ser Ala Gly Ser Pro Gly Ala Ala Leu Ala Leu Phe Gln Gly
5 10 15

Asp Asn Trp Gln Ala Arg
20

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Leu Gly Gly Ala Lys
5

(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 58 amino acids 

(B) TYPE: amino acid 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
Ala Cys Thr Cys Thr Gly Gly Ala Ala Gly Ala Ala Cys Cys Gly
5 10 15

Cys Cys Gly Gly Cys Thr Thr Gly Ala Ala Ala Cys Thr Thr Gly
20 25 30

Gly Thr Thr Thr Thr Thr Thr Cys Thr Gly Gly Cys Thr Ala Cys
35 40 45

Thr Cys Gly Thr Gly Ala Ala Cys Cys Gly Gly Ala Ala
50 55

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
Gly Cys Thr Gly Gly Thr Thr Cys Thr Cys Cys Gly Gly Gly Thr
5 10 15

Gly Cys Thr Gly Cys Thr Cys Thr Gly Gly Cys Thr Cys Thr Gly
20 25 30

Thr Thr Thr Cys Ala Gly Gly Gly Thr Gly Ala Thr Ala Ala Cys
35 40 45

Thr Gly Gly Cys Ala Gly Gly Cys Thr
50

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
Gly Gly Thr Gly Ala Ala Gly Gly Ala Gly Thr Thr Gly Gly Ala
5 10 15

Cys Ala Thr Ala Thr Gly Ala Gly Ala Thr Gly Gly Thr Ala Thr
20 25 30

Cys Cys Ala 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
Met Leu Lys Asn Leu Ala Lys Leu Asp Gin Thr Glu Met Asp Lys 

5 10 15 

Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Arg 
20 25 30 

Tyr Asn Met Pro Val Ile Ala Glu Ala Val
35 40 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 bp 

(B) TYPE: , nucleic acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
ATGCTGAAAA ACCTGGCTAA ACTGGATCAG ACTGAAATGG ATAAAGTTAA 50 
CGTTGAT 57 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

CTGGCTGCTG CTGGTGTTGC TTTTAAGGAA CGTTATAACA TGCCGGTTAT 50 
TGCTGAA 57 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 




(A) LENGTH: 228 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

ATG CTG AAG AAT CTG GCT AAA CTG GAT CAA ACA GAA ATG 39 
GAT AAA GTG AAT GTC GAT TTG GCG GCG GCC GGG GTG GCA 78
TTT AAA GAA CGC TAC AAT ATG CCG GTG ATC GCT GAA GCG 117 
GTT GAA CGT GAA CAG CCT GAA CAT TTG CGC AGC TGG TTT 156 
CGC GAG CGG CTT ATT GCC CAC CGT TTG GCT TCG GTC AAT 195 
CTG TCA CGT TTA CCT TAC GAG CCC AAA CTT AAA 228 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 172 bp 

(B) TYPE: nucleic acid

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

AG GCGTAGCGAA GGGAGCGTGC AGTTGAAGCC ATATTATCTA 42 
TTCCTTTTTG TAATAACTTT TTTACAGACG ATAACCTTGT CTAATGTCTG 92 
AGTCGAGGAT CATCAATTCC GGCTTGCCAT CCTGGCTCAC TCTTAGTAAC 142 
TTTTGCCCGC GAATGATGAG GAGATTAAGA 172 

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 107 bp

(B) TYPE: nucleic acid

(C) STRANDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

TAA AACTTATAC AGAGTTACAC TTTCTTACAT AACGCCTGCT 42 
AAATTATGAG TATTTTCTAA ACCGCACTCA TAATTTGCAG TCATTTTGAA 92 
AAGGAAGTCA TTATG 107 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 76 amino acids 

(B) TYPE: amino acid 




(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Met Leu Lys Asn Leu Ala Lys Leu Asp Gln Thr Glu Met Asp Lys
5 10 15

Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Arg
20 25 30

Tyr Asn Met Pro Val Ile Ala Glu Ala Val Glu Arg Glu Gln Pro
35 40 45

Glu His Leu Arg Ser Trp Phe Arg Glu Arg Leu Ile Ala His Arg
50 55 60

Leu Ala Ser Val Asn Leu Ser Arg Leu Pro Tyr Glu Pro Lys Leu
65 70 75

Lys
76

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Met Leu Lys Asn Leu Ala Lys Leu Asp Gln Thr Glu Met Asp Lys
5 10 15

Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Ala

20 25 30

Tyr Asn Met Pro Val Ile Ala Glu Ala Val
35 40 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 57 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
ATG CTG AAA AAC CTG GCT AAA CTG GAT CAG ACT GAA ATG GAT 42 
AAA GTT AAC GTT GAT 57 

(2) INFORMATION FOR SEQ ID NO: 35: 




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

CTG GCT GCT GCT GGT GTT GCT TTT AAA GAA CGT TAT AAC ATG 42
CCG GTT ATT GCT GAA 57

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
ATGATGAGGA GATTACATAT GCTGAAGAAT CTG 33 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: double 

(D) TOPOLOGY: hook 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

TTTCGGCTTAAGGAG 

TITGCCGAATTCCTCG^ 56 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
Met Thr Ser Arg Arg Asp Trp Gln Leu Gln Gln Leu Gly Ile Thr
5 10 15

Gln Trp Ser Leu Arg Arg Pro Gly Ala Leu Gln Gly Glu Ile Ala
20 25 30

Ile Ala Ile Pro Ala His Val Arg Leu Val Met Val Ala Asn Asp
35 40 45

Leu Pro Ala Leu Thr Asp Pro Leu Val Ser Asp Val Leu Arg Ala
50 55 60

Leu Thr Val Ser Pro Asp Gln Val Leu Gln Leu Thr Pro Glu Lys
65 70 75

Ile Ala Met Leu Pro Gln Gly Ser His Cys Asn Ser Trp Arg Leu
80 85 90

Gly Thr Asp Glu Pro Leu Ser Leu Glu Gly Ala Gln Val Ala Ser
95 100 105

Pro Ala Leu Thr Asp Leu Arg Ala Asn Pro Thr Ala Arg Ala Ala
110 115 120

Leu Trp Gln Gln Ile Cys Thr Tyr Glu His Asp Phe Phe Pro Gly
125 130 135

Asn Asp
137

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 411 bp

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

ATG ACA TCC CGA CGA GAC TGG CAG TTA CAG CAA CTG GGC 39

ATT ACC CAG TGG TCG CTG CGT CGC CCT GGC GCG TTG CAG 78 

GGC GAG ATT GCC ATT GCG ATC CCG GCA CAC GTC CGT CTG 117

GTG ATG GTG GCA AAC GAT CTT CCC GCC CTG ACT GAT CCT 156

TTA GTG AGC GAT GTT CTG CGC GCA TTA ACC GTC AGC CCC 195 

GAC CAG GTG CTG CAA CTG ACG CCA GAA AAA ATC GCG ATG 234 

CTG CCG CAA GGC AGT CAC TGC AAC AGT TGG CGG TTG GGT 273 

ACT GAC GAA CCG CTA TCA CTG GAA GGC GCT CAG GTG GCA 312 

TCA CCG GCG CTC ACC GAT TTA CGG GCA AAC CCA ACG GCA 351

CGC GCC GCG TTA TGG CAA CAA ATT TGC ACA TAT GAA CAC 390 
GAT TTC TTC CCT GGA AAC GAC 411 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 77 bp

(B) TYPE: nucleic acid

(C) STRANDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
GQCGATTATA GCCATATGIT GGCGCGGTA CGACGAATTT GCTATATTTG 50 
CGCCCCTGAC AACAGGAGCG ATTCGCT 77 



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
TGA TTTACCGGCA GCTTACCACA TTGAACAACG CGCCCACGCC 43 
TTTCCGTCGA GTGAAAAAAC GTTTGCCAGC AACCAGGGCG AGCGTTATCT 93
CAAOITICAG 103 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
GATTCCATAT GACATCCCGA CGAGACT 27

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
GACTGGATCC CTGCAGGCCG GTGAATGAGT 30 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 




(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Leu Gly Thr Asp Glu Pro Leu Ser Leu Glu Glu Ala Gln Val Ala
1 5 10 15

Ser Pro 
17 

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Ala Ala Leu Trp Gln Gln Ile Cys Thr Tyr Glu His Asp Phe Phe

5 10 15 

Pro Ala 

17

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 bp 

(B) TYPE: nucleic acid

(C) STRANDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
CAACAGGAGC GATTCCATAT GACATCCCGACG 32

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
GATTCGGATC CCTGCAGGCC GGTGAATGAG T 31

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 bp 

(B) TYPE: nucleic acid 
(C) STRANDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
CCCCACATAT GAAAAACGCG ACGTTCTACC 30 

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

ACCCGGATCC AAACTGCCGG TGACATTC 28 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 441 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

ATG AAA AAC GCG ACG TTC TAC CTT CTG GAC AAT GAC ACC 39 

ACC GTC GAT GGC TTA AGC GCC GTT GAG CAC CTG GTG TGT 78 

GAA ATT GCC GCA GAA CGT TGG CGC AGC GGT AAG CGC GTG 117

CTC ATC GCC TGT GAA GAT GAA AAG CAG GCT TAC GCC CTG 156

GAT GAA GCC CTG TGG GCG CGT CCG GCA GAA AQG TTT GTT 195 

CCG CAT AAT TTA GCG GGA GAA GGA CCG CGC GGC GGT GTA 234

CCG GTG GAG ATC GCC TGG CCG CAA AAG CGT AGC AGC AGC 273 

CGG CGC GAT ATA TTG ATT AGT CTG CGA ACA AGC TTT GCA 312 

GAT TTT GCC ACC GCT TTT ACA GAA GTG GTA GAC TTC GTT 351

CCT CAT GAA GAT TCT CTG AAA CAA CTG GCG CGC GAA CGC 390 

TAT AAA GCC TAC CGC GTG GCT GGT TTC AAC CTG AAT ACG 429 
GCA ACT TGG AAA 441 

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 bp 

(B) TYPE: nucleic acid 

(C) STRANDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
TAACGGCGAA GAGTAATTGC GTCAGGCAAG GCTGTTATTG CCGGATGCGG 50 
CGTGAACGCC TTATCCGACC TACACAGCAC TGAACTCGTA GGCCTGATAA 100 

GACACAACAG CGTCGCATCA GGCGCTGCGG TGTATACCTG ATGCGTATTT 150

AAATCCACCA CAAGAAGCCC CATTT 175

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 bp 
(B) TYPE: nucleic acid

(C) STRANDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 
TAA TGGAAAA GACATATAAC CCACAAGATA TCGAACAGCC GCTTTACGAG 50
CACTOGGAAA AAAGCCAGGA AAGTTTCTGC ATCATGATCC CGCCGOCGAA 100 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 147 amino acids 

(B) TYPE: amino acid

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
Met Lys Asp Ala Thr Phe Tyr Leu Leu Asp Asn Asp Thr Thr Val
5 10 15

Asp Gly Leu Ser Ala Val Glu Gln Leu Val Cys Glu Ile Ala Ala
20 25 30

Glu Arg Trp Arg Ser Gly Lys Arg Val Leu Ile Ala Cys Glu Asp
35 40 45

Glu Lys Gln Ala Tyr Arg Leu Asp Glu Ala Leu Trp Ala Arg Pro
50 55 60

Ala Glu Ser Phe Val Pro His Asn Leu Ala Gly Glu Gly Pro Arg
65 70 75

Gly Gly Ala Pro Val Glu Ile Ala Trp Pro Gln Lys Arg Ser Ser
80 85 90

Ser Arg Arg Asp Ile Leu Ile Ser Leu Arg Thr Ser Phe Ala Asp
95 100 105

Phe Ala Thr Ala Phe Thr Glu Val Val Asp Phe Val Pro Tyr Glu
110 115 120

Asp Ser Leu Lys Gln Leu Ala Arg Glu Arg Tyr Lys Ala Tyr Arg
125 130 135

Val Ala Gly Phe Asn Leu Asn Thr Ala Thr Trp Lys
140 145 147

(2) INFORMATION FOR SEQ ID NO: 54: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Met Lys Asn Ala Thr Phe Tyr Leu Leu Asp Asn Asp Thr Thr Val
5 10 15

Asp Gly Leu Ser Ala Val Glu Gln Leu Val Xxx Glu Ile Ala

20 25 

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

Val Leu Ile Ala Xxx Glu Asp Glu Lys
5 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Leu Asp Glu Ala Leu Trp Ala Ala Pro Ala Glu Ser Phe Val Pro

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Gly Gly Ala Pro Val Glu Ile Ala Trp Pro

5 10

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Gly Phe Asn Leu Asn Thr Ala Thr
5 

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 bp 

(B) TYPE: nucleic acid

(C) STRANDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 
OGXXACATAT GAAAAACGCG ACGTTCTACC 30

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 bp 

(B) TYPE: nucleic acid

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:



